Showing posts from August, 2019

The 'jstor_ocr' function in the 'r7283' package for concatenating OCR text and metadata from JSTOR's Data for Research

Digital Text Investigations

The digital humanities continue to change the ways in which we draw conclusions about social phenomena. This change starts from the understanding that, for the first time in history, researchers can potentially examine a social phenomenon at the full scale of its appearance. The field's continuous evolution provides new ways to examine data. A key idea in this evolution is the ability to pull together unstructured data and their accompanying metadata, as a response to older forms of content analysis and related approaches. The JSTOR Data for Research (DfR) arrangement offers one such opportunity to work with unstructured data. Subscribers can request large, carefully delineated corpora for academic investigations. At the time of writing, there are two options for data requests. The first option allows the subscriber to create search terms and, without a signed contract, scale down the results and download n-grams (roughly 1-3 combinations are available