Underwood 2019
Underwood, Ted. Distant Horizons: Digital Evidence and Literary Change. Chicago: University of Chicago Press, 2019.
“Instead of starting with, say, the frequency of connective words, quantitative literary research now starts with social evidence about things that really interest readers of literature—like audience, genre, character, and gender. The literary meaning of those phenomena comes, in a familiar way, from historically grounded interpretive communities. Numbers enter the picture not as an objective foundation for meaning somewhere outside history but as a way to establish comparative relationships between different parts of the historical record.” (xii)
“Institutions that strive to be unbiased might well choose to avoid machine learning. When we’re reasoning about the past, on the other hand, our aim is usually to acknowledge and explore biases, not to efface them. Understanding the subjective preferences implicit in a particular selection of literary works, for instance, may be exactly the goal of our research. For this kind of project, it is not a problem but a positive advantage that machine learning tends to absorb assumptions latent in the evidence it is trained on. By training models on evidence selected by different people, we can crystallize different social perspectives and compare them rigorously to each other.” (xv)
“Perspectival modeling”
“The models created in this book are supervised: that is, they always start from evidence labeled by human readers. But unlike supervised models that try to divine the real author of an anonymous text, perspectival models do not aim simply to reproduce human judgment. They are used instead to measure the parallax between different observers.” (xv)
“Instead of displacing previous scales of literary description, distant reading has the potential to expand the discipline—rather as biochemistry expanded chemistry toward a larger scale of analysis.” (xviii)
“Quantitative models are no more objective than any other historical interpretation; they are just another way to grapple with the mystery of the human past, which doesn’t become less complex or less perplexing as we back up to take a wider view.” (xix)
Chapter 1
Criticism is seen as a dialectical struggle: “Literary scholars, by contrast, commonly do assume that critical approaches are locked in dialectical struggle. And this assumption is not arbitrary: the premise has been correct for much of our history. Critical debates amount to struggles over a scarce resource—readerly attention.” (1)
“Literary scholars tend to feel they are arguing about the redistribution of interpretive emphasis within fixed historical outlines. Implicit in this self-understanding is an assumption that the broad divisions of literary debate are already known: that we are unlikely to discover, for instance, a new genre or period in the archives.” (2)
Example comparing two paragraphs: shows that general trends (e.g. narrational impersonality and showing) hold true but may be overstated in criticism
“I have been careful throughout this book not to put much evidentiary weight on lists of words. While fixed semantic categories may be useful as loose abbreviations, we cannot trust them to describe cultural phenomena precisely. For trustworthy description, we need some kind of evidence more deeply rooted in historical context.” (16)
“Instead of measuring things, finding patterns, and then finally asking what they mean, we need to start with an interpretive hypothesis (a “meaning” to investigate) and invent a way to test it.” (17)
Uses machine learning to distinguish fiction from biography
“To put this another way: the direction fiction moved from 1750 to 1950 can be concisely described as “away from biography.” It begins to look like the novel steadily specialized in something that biography (and other forms of nonfiction) could rarely provide: descriptions of bodies, physical actions, and immediate sensory perceptions in a precisely specified place and time.” (26)
“historical interpreters often have to accept that there is no single authoritative account of the past. This can be frustrating—especially when numbers are involved, since we associate numbers with objectivity. But instead of seeking an objective metric to solve the problem, the best course of action is often to consider different perspectives.” (27)
“The temptation to see macroscopic discoveries as less than genuinely new is particularly strong for literary scholars because we are trained to find disciplinary significance only in claims that directly reverse existing expectations. When we are writing about individual books, this is a reasonable goal. A single book is easy to summarize. So if I plan to write about a novel you already read, I need to directly reverse some of your expectations in order to tell you anything new. But when we are surveying three centuries of literary history, descriptive summary is barely possible. Privileging counterintuitive claims here would be as absurd as privileging counterintuitive arguments about climate change. The truth is that we barely have intuitions about patterns on this scale; our expectations are not clearly formed yet, and it would be just as important to confirm them as to confute them.” (32)
Argues that what the numbers show confirms what scholars knew already, and that confirmation is necessary because it adds to the picture
Chapter 2
“The computer knows nothing about literary history: it models only the evidence we give it. This useful blindness will allow us to provisionally bracket twentieth-century science fiction and to model Verne purely by contrasting him to his nineteenth-century contemporaries. Then we can compare those models to models of the twentieth-century genre and see how closely their predictions align. In the pages that follow, I will call this method “perspectival modeling.”” (36)
“Instead of being more volatile than communities of reception, textual patterns turn out to be, if anything, more durable.” (40)
“Critics of quantitative approaches to culture often worry, with Martin Jay, that “there is no easy passage from micro- to macroanalysis.” That might be true, if macroanalysis meant simply counting words (as Jay assumes). A graph of macroscopic trends in word frequency can’t tell us how the trend might have been produced by changes in individual books or paragraphs. But a predictive model of genre is another matter. Models are inherently relational, and one of their strengths is to build bridges between different scales of analysis—allowing us to understand how the historical contrast between two periods was expressed at the scale of the paragraph.” (64)
Another discussion of how numbers are not objective (66-7)
Chapter 3
“This chapter will similarly train models that use textual evidence to predict readers’ responses to literary works.” (68)
“The premise of our inquiry was that the stylistic differentiation caused by a widening social gulf should make literary prestige easier to model.” (71)
“The big picture is that a stylistic stratification of literature is already clear in the middle of the nineteenth century and then remains fairly stable through the middle of the twentieth.” (71)
“I have modeled literary prestige instead as the probability that an author will be discussed in certain elite periodicals. The assumption underlying this model is that being reviewed indicates a sort of literary distinction, even if your book is panned.” (73)
“Literary judgment is never easy to predict; the models described in this chapter range from only 72.5% to 83% accurate. But the part of reception that can be predicted at all is predicted by models that change relatively little from the 1850s through the 1940s. Moreover, a large component of the change that does take place appears to have a clear long-term social rationale. Poetry and fiction both move steadily in the direction of prevailing critical standards. Poetry apparently moves twice as fast as fiction. There may be a scholar somewhere who expected to see all this, but I confess that I didn’t. I believed the histories that taught me to interpret the last two hundred years as a series of conflicts between roughly generational literary movements, separated by periods of stability.” (105)
Emphasizes again at the end of the chapter that critics have focused on generational revolutions, and that quantitative methods can help us see more gradual change
Chapter 4
“Two trends point apparently in opposite directions. The first is that gender divisions between characters have become less predictable. In the middle of the nineteenth century, very different language is used to describe fictional men and women. But that difference weakens steadily as we move forward to the present; the actions and attributes of characters are less clearly sorted into gender categories. This may sound like a progressive story: a character’s role in a narrative is increasingly independent of his or her public gender identity. But the second trend this chapter will describe points in a different direction. If we trace the sheer space on the page allotted to women, we discover a startling decline both in the number of characters who are women or girls and in the percentage of a text writers devote to describing them. In short, while gender roles were becoming more flexible, the attention actually devoted to women was declining.” (114)
Points out that the book leans heavily on description rather than explanation, but that description at the scale of centuries is itself difficult (139)
Chapter 5
Covers the risks associated with distant reading
“quantitative analysis of large digital libraries requires a massive commitment of time and labor; I would be surprised if even 2% of literary scholars undertook that commitment over the next decade.” (145)
“The reason literary scholars are unlikely to use math in every article is simply that literary scholarship already excels at its own mission. We probably could use computational analysis to assist every close reading. It wouldn’t be epistemically impossible or ethically perilous to do so. It’s just that we can usually do a better (more vivid, more concise) job on our own. Instead of inventing a stretched story about the dangers of quantification, in other words, I propose to limit the authority of numbers in the humanities by remembering to appreciate some things we already do well.” (147)
“Literary scholars do have a special form of knowledge. But the candid way to define its distinctiveness is to say that we have the privilege of focusing on things that are interesting or enjoyable.” (148)
“Unreflectively turning literary scholarship into social science would be a bad idea. Quantitative arguments can bog down in finicky details, and detailism would aggravate a recent tendency to separate literary history from pleasure. But detailism is not inevitable. At its best, distant reading pushes in the other direction, adding a new liveliness and sweep to historical inquiry.” (150)
“But I am only willing to separate literary history from social science by bluntly emphasizing literary interest and enjoyment.” (150)