Find and download full-text corpus data of American English drawn from spoken transcripts, fiction, popular magazines, newspapers, and academic texts from 1990 onward.
Access and download the complete COCA data sets in three different formats with your current UA NetID.
Find language resources that support language-related education, research, and technology development, including lexicons, speech files, transcripts, and other text files from 1999 to present.
Registration is required to download any datasets, and additional user agreements may be required. Register and create a new account here. When creating a new account, use "University of Arizona, Library System" as the organization and your UA email address. Our corpus administrator will authorize your account, and you will receive an email once your UA status is verified.
Some of this data is also available in the Library on DVDs and CD-ROMs (for check-out to use on computers outside the Library). Search for titles in the library's Catalog using Linguistic Data Consortium as the author, or search by known titles.
Main Library | 1510 E. University Blvd.
Tucson, AZ 85721
(520) 621-6442
© 2023 The Arizona Board of Regents on behalf of The University of Arizona.