
Getting intersectional with methodologies: Going reactive, getting archival, getting big, with data

Considering Newer Research Methodologies

At present, most researchers place themselves within one of the preset methodologies set forth decades ago, in the mid-2000s. These are usually divided into qualitative, quantitative, or mixed methods. This is fine and good, and it allows researchers to fall back on traditions that have been years in the making. This is how we extend precedent and appeal to previous logic to ground the case that our data collection is sound. However, what about making the case that it is time for new methodologies that are more intersectional than mixed? Can we add richness to research methodologies, and take on some of the emerging issues in education, by inviting transdisciplinary involvement with research data?

One methodology that I have considered recently traverses qualitative research, content analysis, and digital humanities methods. I have argued that starting from the archive, with staging and preliminary analysis borrowed from data science, gives researchers control over what happens downstream, when content analysis is done in smaller chunks. We live in a time when data can be captured almost completely, so that the entirety of a phenomenon can be examined. Take the case of microblogging, Facebook postings, or other content sources. It is possible to collect all document traces related to a phenomenon, capture them, and tag them with metadata as big data sources. This possibility means that research approaches need to change.

Build the archive

In a movement of large gestures, the researcher sets out to collect the archive. This means defining the research questions so that the capture is large enough to ask the large, sweeping questions that big data can help answer. For example, a researcher might ask, "What is Facebook user sentiment towards Covid-19?" or, "How does love get defined in the 18th century British novel?" The archive becomes the data that helps build the answers to the questions that you have posed. There are three requirements at this stage of building the archive: (1) the archive should include the whole of the phenomenon, (2) the documents must be scalable, with some kind of search aid, and (3) the archive must be viewable.

Display the archive

That the archive is built first matters for the procedures that come after. Part of its purpose is to display what has been gathered of the totality of the phenomenon or, in the case of microblogging or Facebook posts, the discourse. Here is where the archive takes a dramatic turn. Researchers could manually cut and paste materials from the archive in order to search it, contour it, and draw conclusions from it, but there are more advanced ways of displaying the archive at hand. The dashboard is one such way of seeing the data in the archive. Dashboards have only been around recently, and this is perhaps the first time that the dashboard has been suggested as a way to build and display the archive. With the dashboard comes the idea that the archive can be built reactively.

Build a reactive archive

One way to build the archive is to do it reactively (Wickham, 2021). Shiny apps can help ease one's way through the data. The Shiny app presents a webpage that holds the data in one or more data frames, which become accessible to the user on the client side of the page. On the server side is the interactive code that allows R to react to selections made from drop-down menus, sliders, and so on, so that distinct views of the data can be produced. The code in Figure 1 shows how one can create the server side and the reactive input as ways to navigate the webpage for a biterm topic model, with the data fed to the algorithm at the beginning of the function.

Figure 1: The Reactive Archive and the Server

The output is then a set of topics chosen by the user, who selects a number from a slider; the probabilistic terms from the topic model for that selection are shown in a data frame on the screen.
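The reactive idea itself, that a view is recomputed whenever an input changes, can be sketched outside of Shiny. What follows is a minimal illustration in plain Python rather than R, and it is not the app described in this post. The `ReactiveInput` class, the `render_topic_table` function, and the topic-term values are all hypothetical, invented only to show the pattern.

```python
# Toy topic-term table; the terms and probabilities are invented for illustration.
TOPIC_TERMS = {
    1: [("vaccine", 0.12), ("mask", 0.09), ("school", 0.05)],
    2: [("family", 0.10), ("home", 0.08), ("work", 0.04)],
}


class ReactiveInput:
    """Holds one input value and re-runs registered views when it changes."""

    def __init__(self, value):
        self._value = value
        self._observers = []

    def observe(self, view):
        """Register a view function and run it once with the current value."""
        self._observers.append(view)
        view(self._value)

    def set(self, value):
        """Update the value (e.g. the slider moved) and refresh every view."""
        self._value = value
        for view in self._observers:
            view(value)


rendered = []  # stands in for the table shown on the page


def render_topic_table(topic):
    """Replace the displayed rows with the terms of the selected topic."""
    rendered[:] = TOPIC_TERMS[topic]


slider = ReactiveInput(1)
slider.observe(render_topic_table)  # initial render for topic 1
slider.set(2)                       # the user moves the slider to topic 2
```

After the last line, `rendered` holds the rows for topic 2 without the view ever being called by hand. That is the behavior a reactive framework provides for the user.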

Slicing as showing data and encouraging cutting through data

Shiny apps are a way to communicate results or perspectives clearly to uninitiated or novice R users who require quick snapshots of summarized views into the data or the models that have been constructed. In the example here, the archive can be sliced so that important views are shown in many different ways. For example, a list of the most widely used action verbs in the texts might be shown, or sentiment analysis might be calculated to determine positive affect in the archive. What we expect is insights coming from different directions and different calculations, all laying bare significant features of the texts at hand that are not necessarily viewable from the archive itself, but are viewable with summary dashboards.
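To make the idea of slicing concrete, here is a minimal sketch in plain Python (used only for illustration; the dashboard described in this post is built in R). It assumes a tiny archive of invented posts and a hypothetical three-word lexicon, and it computes two slices: the most frequent terms, and the share of documents with at least one positive word. A real study would use a validated lexicon and a proper part-of-speech tagger for a verbs list.

```python
from collections import Counter
import re

# Toy archive: invented posts standing in for collected documents.
ARCHIVE = [
    {"id": 1, "text": "We share and support each other"},
    {"id": 2, "text": "People share news and worry"},
    {"id": 3, "text": "I worry but friends support me"},
]

# Hypothetical mini-lexicon; a real study would use a validated one.
POSITIVE_WORDS = {"support", "friends", "love"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def top_terms(archive, n):
    """Slice 1: the n most frequent tokens across the whole archive."""
    counts = Counter(tok for doc in archive for tok in tokenize(doc["text"]))
    return counts.most_common(n)


def positive_share(archive, lexicon):
    """Slice 2: fraction of documents containing at least one lexicon word."""
    hits = sum(1 for doc in archive if lexicon & set(tokenize(doc["text"])))
    return hits / len(archive)


frequent = top_terms(ARCHIVE, 4)
share = positive_share(ARCHIVE, POSITIVE_WORDS)
```

Each function is one view into the same archive. A dashboard would place such views side by side and let the reader move between them.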

Writing Through the Universe by Sampling the Archive

In this methodology, some quantitative skills are at play. For example, because the archive is treated as the universe of the phenomenon, a randomizer could be called upon to query the archive, and conclusions could then be drawn from the randomly sampled evidence. It is assumed that the archive holds many conclusions and multiple perspectives waiting to be discovered. It is also assumed that some of these conclusions might be at odds with, or outright contradict, one another.
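A randomizer of this kind can be very small. The sketch below, in plain Python for illustration, draws a simple random sample without replacement from a hypothetical archive of 1,000 document identifiers. The function name `sample_archive`, the sample size, and the seed are assumptions made for the example, not part of any existing tool.

```python
import random


def sample_archive(archive, k, seed):
    """Draw k documents uniformly at random, without replacement."""
    rng = random.Random(seed)  # fixed seed so the same draw can be repeated
    return rng.sample(archive, k)


# Hypothetical archive: 1,000 document ids standing in for real records.
archive = [f"doc-{i:04d}" for i in range(1000)]

first_draw = sample_archive(archive, k=25, seed=2021)
second_draw = sample_archive(archive, k=25, seed=2021)
```

Fixing the seed means another researcher can recover exactly the same documents, while changing it gives a fresh draw from the same universe.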

Does Sampling the Archive make it Quantitative?

The pull to fall back at once on what is already known (the qualitative/quantitative debate) is strong, and sampling an archive might be jarring to those who are comfortably situated in their research tradition. However, sampling the voicings within an archive would allow themes to come forth that might otherwise remain dormant. This is not to say that the archive is probabilistic (which it might be); rather, it is open to possibilities for what we might find in it through sampling.

Conclusion: Newer Methodologies for newer data assumptions

From the early to the late 2000s, it was common to interview a handful of participants, thematize the resulting qualitative findings, and write about them deeply and with thick description. These few accounts would be made reachable to readers who found resonance in them. However, as the information age wears on, it is possible to find the totality of a phenomenon expressed. In many cases, these social phenomena can be collected and preserved for the study, and for the analysis that comes after, through a combination of digital humanities, content analysis, and qualitative research methodologies. Transecting these methodologies, when appropriate, forges a way forward: an archive that is displayable and discoverable, and that acts as source material for potentially numerous researchers to draw conclusions from, either with natural language processing methods or with smaller-scale qualitative research.


Wickham, H. (2021). Mastering Shiny: Build interactive apps, reports, and dashboards powered by R. O'Reilly Media.
