Demography of Literary Form: Probabilistic Models for Literary History




Riddell, Allen


Hayles, Katherine




Digitization of library collections has made millions of books, newspapers, and academic journal articles accessible. These resources present an opportunity for historians interested in identifying patterns in cultural production that emerge over the space of decades or even centuries. For example, considerable interest has been expressed in studying the emergence, decline, and transmission across national and linguistic boundaries of literary form in the tens of thousands of novels published in Europe in the eighteenth and nineteenth centuries. Navigating such a large collection of texts, however, requires the use of quantitative methods rarely used in literary studies; the single, direct reading of even a thousand texts exceeds the time and resources available to most historians.

This dissertation demonstrates the application of probabilistic models of texts to the study of literary history. The major finding of the dissertation is that regularities previously identified by literary historians can be captured by probabilistic models. Following the first chapter, "How to Read 22,198 Journal Articles: Studying the History of German Studies Using Topic Models," which introduces representations of texts used in the dissertation, chapter 3, "Inferring Novelistic Genre in the English Novel, 1800-1836," and chapter 4, "Networks of Literary Production," illustrate the contribution that probabilistic models of novelistic production are positioned to make to long-standing questions in literary history. Both chapters are concerned with the detection and description of empirical regularities in surviving nineteenth-century English novels, such as the recurrence of novelistic genres (e.g., gothic, silver fork, and national tale novels). Chapter 3 makes use of a corpus that includes a random sample of novels published in the British Isles between 1800 and 1836. The use of a random sample and of probabilistic methods, both uncommon in literary studies, serves to develop new conceptual resources for future work in literary history and the sociology of literature.






Riddell, Allen (2013). Demography of Literary Form: Probabilistic Models for Literary History. Dissertation, Duke University. Retrieved from


Duke's student scholarship is made available to the public using a Creative Commons Attribution / Non-commercial / No derivative (CC-BY-NC-ND) license.