
Getting intersectional with methodologies: Going reactive, getting archival, getting big, with data

Considering Newer Research Methodologies

Presently, most researchers place themselves within one of the preset methodologies set forth decades ago and settled by the mid-2000s. These are usually divided into qualitative, quantitative, or mixed methods. This is well and good, and it allows researchers to fall back on traditions that have been years in the making. This is how we extend precedent and appeal to previous logic to ground the case that our data collection is sound. However, what about making the case that it is time for new methodologies, ones more intersectional than mixed? Can we add richness to research methodologies, and take on some of the emerging issues in education, when we invite transdisciplinary involvement with research data?

One methodology that I have considered recently traverses qualitative research, content analysis, and digital humanities methods. I have argued that starting from the archive, with staging and preliminary analysis borrowed from data science, gives researchers control over what happens downstream when content analysis proceeds in smaller chunks. We live in a time when data can be captured nearly in its entirety, allowing us to examine the whole of a phenomenon. Take the case of microblogging, Facebook postings, or other content sourcing: it is possible to collect all the document traces related to a phenomenon, capture them, and meta-tag them as big data sources. This possibility calls for a change in research approaches.

Build the archive

In a movement of large gestures, the researcher is interested in collecting the archive. This means defining the research questions so that the capture is large enough to ask the sweeping questions that big data can help answer. For example, a researcher might pose the question, "What is Facebook user sentiment toward Covid-19?" or, "How does love get defined in the 18th-century British novel?" The archive becomes the data that helps build the answers to the questions that you have posed. There are three requirements at this stage of building the archive: (1) the archive should include the whole of the phenomenon, (2) the documents must be scalable with some kind of search aid, and (3) the archive must be viewable.
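As a minimal sketch of what requirements (2) and (3) could look like in R, consider the base R snippet below. The folder path, metadata field, and search_archive helper are illustrative assumptions, not a prescribed implementation.

# A hypothetical sketch of assembling an archive as a searchable, viewable
# data frame in base R; the folder path and metadata fields are assumptions.

files <- list.files("archive_documents", pattern = "\\.txt$", full.names = TRUE)

archive <- data.frame(
  doc_id    = basename(files),
  text      = vapply(files, function(f) paste(readLines(f, warn = FALSE),
                                              collapse = " "), character(1)),
  collected = Sys.Date(),              # a simple metadata tag
  stringsAsFactors = FALSE
)

# A minimal search aid: return the documents whose text matches a keyword
search_archive <- function(archive, keyword) {
  archive[grepl(keyword, archive$text, ignore.case = TRUE), ]
}

# Example: view every archived document that mentions "sentiment"
# search_archive(archive, "sentiment")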


Display the archive

The fact that the archive is built first is not lost on the procedures that come after. On the contrary, part of its purpose is to display what has been gathered of the totality of the phenomenon, or, in the case of microblogging or Facebook posts, of the discourse. Here is where the archive takes a dramatic turn. To those who would say that researchers can physically cut and paste materials from the archive to search it, contour it, and draw conclusions from it, there are now more advanced ways of displaying the archive at hand. The dashboard is one such way of presenting advanced views of the data in the archive. Dashboards have only recently become widely available, and this is perhaps the first time the dashboard has been suggested as a way to build and display the archive. With the dashboard comes the idea that the archive can be built reactively.

Build a reactive archive

One way to build the archive is to do it reactively (Wickham, 2021). To assist with this, it is possible to use Shiny apps that ease one's way through the data. The Shiny app is a webpage that holds the data in one or more data frames that become accessible to the user on the client side of the page. Underneath, on the server side, is the reactive code that allows R to respond to selections made from drop-down menus, sliders, and so on, so that distinct views of the data can be produced. The code below shows how one can create the server side and the reactive input as ways to navigate a webpage built around a biterm topic model, with the data fed to the algorithm at the beginning of the function.
Figure 1: The Reactive Archive and the Server
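The following is a minimal sketch of that idea, assuming a biterm topic model has already been fit with the BTM package and saved to disk; the file name, object names, and number of displayed terms are illustrative assumptions rather than the exact code shown in Figure 1.

# A minimal sketch of a reactive archive app, assuming a biterm topic model
# has already been fit with the BTM package and saved as 'btm_model.rds'.
# Object names and the number of displayed terms are illustrative assumptions.

library(shiny)
library(BTM)

btm_model <- readRDS("btm_model.rds")   # pre-fit biterm topic model (assumed)

ui <- fluidPage(
  titlePanel("Reactive archive: biterm topic model"),
  sidebarLayout(
    sidebarPanel(
      # The slider lets the user choose which topic to inspect
      sliderInput("topic", "Topic number:",
                  min = 1, max = btm_model$K, value = 1, step = 1)
    ),
    mainPanel(
      # A data frame of the most probable terms for the chosen topic
      tableOutput("topic_terms")
    )
  )
)

server <- function(input, output, session) {
  # Reactive expression: recomputes only when the slider value changes
  selected_terms <- reactive({
    terms(btm_model, top_n = 15)[[input$topic]]
  })

  output$topic_terms <- renderTable({
    selected_terms()
  })
}

shinyApp(ui, server)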

The output is then a set of topics: the user selects a topic number from the slider, and the most probable terms from the topic model appear in a data frame on the screen.

Slicing: showing the data and encouraging cutting through it

Shiny apps are a way to communicate results or perspectives clearly to uninitiated or novice R users who need quick snapshots of summarized views into the data or the models that have been constructed. In the example here, the archive can be sliced, with important views shown in many different ways. For example, perhaps a list of the most widely used action verbs in the texts has been displayed, or sentiment analysis has been calculated to gauge positive affect in the archive. What we expect are insights coming from different directions and different calculations, all laying bare significant features of the texts at hand that are not necessarily viewable from the raw archive but are viewable in summary dashboards.
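As one hedged illustration of such a slice, a sentiment summary could be computed with the tidytext package; the archive data frame from the earlier sketch, its column names, and the net_affect score below are assumptions made for the sake of the example.

# A hypothetical "slice" for the dashboard: a per-document sentiment summary
# built with tidytext. The 'archive' data frame (doc_id, text) is assumed.

library(dplyr)
library(tidyr)
library(tidytext)

sentiment_slice <- archive |>
  unnest_tokens(word, text) |>                        # one row per word
  inner_join(get_sentiments("bing"), by = "word") |>  # attach positive/negative labels
  count(doc_id, sentiment) |>
  pivot_wider(names_from = sentiment, values_from = n, values_fill = 0) |>
  mutate(net_affect = positive - negative)            # crude positive-affect score

# The resulting data frame can be rendered as another dashboard view,
# for example with renderTable(sentiment_slice) inside the Shiny server.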

Writing Through the Universe by Sampling the Archive

In this methodology, some quantitative skills are at play. Because the archive is treated as the universe of the phenomenon, a randomizer can be called upon to query it, which then allows conclusions to be drawn from the randomly selected evidence. It is assumed that the archive holds tremendous conclusions and multiple perspectives waiting to be discovered. It is also assumed that some of those conclusions might be at odds with one another, or outright contradict one another.
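A minimal sketch of such a randomized query, assuming the archive data frame from earlier; the sample size and seed are arbitrary choices made for illustration.

# Draw a reproducible random sample of documents from the archive for
# closer qualitative reading; 'archive' and the sample size are assumptions.

library(dplyr)

set.seed(2021)                         # make the random draw reproducible
sampled_docs <- archive |>
  slice_sample(n = 50)                 # 50 randomly selected documents

# The sample can then be exported for coding and thick description, e.g.:
# write.csv(sampled_docs, "sampled_docs.csv", row.names = FALSE)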

Does Sampling the Archive make it Quantitative?


The immediacy of falling back on what is already known (the qualitative/quantitative debate) might be jarring to those who are comfortably situated in their research tradition. However, the idea of sampling the voicings within an archive would allow themes to come forth that might otherwise remain dormant. This is not to say that the archive is probabilistic (though it might be); rather, it is open to possibilities for what we might find in it through sampling.

Conclusion: Newer methodologies for newer data assumptions

From the early to the late 2000s, it was common to thematize a group of qualitative findings after interviewing a handful of participants by qualitative research means, and to write about them deeply and with thick description. These few accounts would be made reachable to readers who found resonance in them. However, as the information age wears on, it is possible to find the totality of a phenomenon expressed. In many cases, these social phenomena can be collected and preserved for the study, and for the analysis that comes after, through a combination of digital humanities, content analysis, and qualitative research methodologies. Transecting these methodologies, when appropriate, forges a way forward to create an archive that is displayable, discoverable, and able to serve as source material for potentially numerous researchers, who can draw conclusions with either natural language processing methods or smaller-scale qualitative research.

References


Wickham, H. (2021). Mastering Shiny: Build interactive apps, reports, and dashboards powered by R. O'Reilly Media.

