The University of Arizona


Text Mining

This guide provides an introduction to text mining resources and tools.

Your Librarian

Niamh Wallace
Contact:
Main Library A209
(520) 621-4869
Subjects: Anthropology

What is text mining?

Text mining is a method of turning text into data for computational analysis. It can uncover patterns in large bodies of text (called corpora) that might otherwise be hidden. Source: Underwood, T. (2015). Seven Ways Humanists are Using Computers to Understand Text. The Stone and the Shell.
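
As a rough illustration of "turning text into data", the sketch below splits a short passage into lowercase words and counts how often each occurs. The function name word_counts and the sample sentence are made up for this example, and the simple regular-expression tokenizer is an assumption; real projects typically use more careful tokenization.

```python
import re
from collections import Counter

def word_counts(text):
    """Return a Counter mapping each lowercase word to its frequency."""
    # Words are runs of letters, optionally with one internal apostrophe
    # (e.g. "don't"). This is a deliberately simple tokenizer.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(tokens)

# Hypothetical sample text standing in for a document from a corpus.
sample = "The cat sat on the mat. The dog sat too."
counts = word_counts(sample)

# The two most frequent words and their counts.
print(counts.most_common(2))  # [('the', 3), ('sat', 2)]
```

The same counts, computed over thousands of documents, are the kind of data in which patterns across a corpus can start to show up.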

How do I get started?

In starting a text mining project, think about the following questions:

  • What is your research question?
  • What text(s) do you want to use?
  • Are the texts available in machine-readable form?
  • What is the quality of the texts? Do they need to be corrected/cleaned up?
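
The last question above often comes down to basic cleanup of scanned or copied text. The following is a minimal sketch of one possible cleaning step, assuming plain-text input; the function name clean_text and the sample string are hypothetical, and the three rules shown (Unicode normalization, rejoining words hyphenated across line breaks, collapsing whitespace) are only a starting point.

```python
import re
import unicodedata

def clean_text(raw):
    """Apply a few common cleanup rules to raw plain text."""
    # Normalize compatibility characters (e.g. ligatures) to plain forms.
    text = unicodedata.normalize("NFKC", raw)
    # Rejoin words that were hyphenated across a line break.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces, tabs, and newlines into single spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()

# Hypothetical messy input, as might come from OCR or a PDF export.
messy = "Text  min-\ning turns\ttext into\n data."
print(clean_text(messy))  # Text mining turns text into data.
```

Note that the hyphen rule also merges genuinely hyphenated words that happen to break across lines, so spot-check the output against the original texts before relying on it.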

Find content to mine: library resources

Text and data mining, and systematic downloading, are not permitted under most of the Library's license agreements. The resources in this section do allow text mining. For questions about text mining access to other library resources, please contact us.

Find content to mine: other resources

Use text analysis tools

Use text mining tools with corpora included