The University of Arizona

Text Mining

This guide provides an introduction to text mining resources and tools.

Your Librarian

Niamh Wallace
Main Library A209
(520) 621-4869

What is text mining?

Text mining is a method of turning text into data for computational analysis. It can uncover patterns in large bodies of text (called corpora) that might otherwise be hidden. Source: Underwood, T. (2015). Seven Ways Humanists are Using Computers to Understand Text. The Stone and the Shell.
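As a rough illustration of what "turning text into data" can mean, the sketch below counts how often each word appears in a short passage. The sample sentence and the function name term_frequencies are made up for this example, and the simple tokenizing rule (lowercase letters and apostrophes only) is an assumption; real projects often need more careful tokenization.

```python
import re
from collections import Counter


def term_frequencies(text):
    """Lowercase the text, split it into word tokens, and count each token."""
    # Assumption: a "word" is any run of letters or apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Hypothetical sample text, standing in for a document from a corpus.
sample = "The whale surfaced. The crew watched the whale."
counts = term_frequencies(sample)

# Show the two most frequent words and their counts.
print(counts.most_common(2))  # prints [('the', 3), ('whale', 2)]
```

Counts like these are one of the simplest forms of data derived from text; applied across a whole corpus, they can begin to surface the kinds of patterns described above.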

How do I get started?

In starting a text mining project, think about the following questions:

  • What is your research question?
  • What text(s) do you want to use?
  • Are the texts available in machine-readable form?
  • What is the quality of the texts? Do they need to be corrected/cleaned up?
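For the last question, a minimal sketch of one possible cleanup step is shown here. It assumes the text came from scanning or OCR and contains words hyphenated across line breaks and irregular spacing. The function name clean_text and the sample string are hypothetical, and these two rules are only a starting point, not a complete cleaning procedure.

```python
import re


def clean_text(raw):
    """Apply two simple cleanup rules to raw extracted text."""
    # Rejoin words hyphenated across a line break, e.g. "docu-\nments" -> "documents".
    # Caveat: this also joins true compounds split at a line end ("well-\nknown").
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", raw)
    # Collapse runs of whitespace, including remaining line breaks, into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


# Hypothetical sample of messy extracted text.
raw = "Text  mining turns docu-\nments into   data.\n"
print(clean_text(raw))  # prints: Text mining turns documents into data.
```

Which rules are appropriate depends on the source: OCR output from historical newspapers, for example, usually needs more correction than text that was created digitally.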

Find content to mine: library resources

Most of the Library's license agreements do not permit text and data mining or systematic downloading. These are some resources that do allow text mining. For questions about text mining access to other library resources, please contact us.

Find content to mine: other resources

Use text analysis tools

Use text mining tools with corpora included