Lev Manovich’s article in DinDH on big social data challenges DHers to consider how they (we) can begin to use huge data sets (e.g., Flickr photos or YouTube videos) to deepen and improve our humanities & social science research. LM draws an important distinction between “surface” and “deep” data. An example of the former would be U.S. census data, which offers a once-a-decade “snapshot” of the American population at a macro level. (I would suggest, though, that when you look at census data disaggregated, i.e., on the original enumerated forms filled out by the census takers, you get something closer to “deep” data about individuals.) The latter (“deep”) data he illustrates with a psychologist’s engagement with an individual patient over time, which generates a full sense of an individual’s life (not exactly data we have access to, though). He goes on to suggest that the explosion in social media has blurred the boundaries and distinctions between deep and surface data sets.
That said, LM then offers four “objections” to the optimistic view that available social media data will usher in a brave new world of research vistas. The first objection is the obvious fact that large social media companies (e.g., Google, Facebook) limit researchers’ access to their data. Second, LM cautions that we need to be careful about issues of authenticity when we read data from social networks, because so much of what individuals do on social media is performative and thus not necessarily an accurate depiction of their lives. Third, he rejects the current notion that we no longer have to choose between getting and using deep or surface data. LM reminds us that these types of data are, indeed, different and that their uses and purposes can and should vary, depending on the particular research questions researchers decide to ask. He also suggests that we not let the sheer size and availability of large data sets frame (or, even worse, limit) the kinds of humanities and social science research questions we choose to ask. And fourth, because many big data questions are technically and organizationally complex and thus difficult to solve, LM notes that a final reason not to be optimistic is that it’s hard for humanists to collaborate with computer scientists, etc., especially across large disciplinary boundaries and silos.
LM notes that these four objections hardly exhaust the possible problems and limitations of big data access (he names privacy concerns as one big area he hasn’t addressed). Even so, in the conclusion to the piece he remains optimistic about what can still be done within the limited parameters he describes (“the possibilities are endless”), with the added qualification, “if you know some programming and data analytics and also are open to asking new types of questions about human beings, their social lives, their cultural expressions, and their experiences” (473).
My final question after reading LM’s very smart article was whether researchers’ technical limitations are the biggest problem for DH, or whether the problem is that DHers aren’t yet asking the right “new types of questions.” What do you think?