Clustering Running Titles to Understand the Printing of Early Modern Books

Vogler, Nikolai; Goyal, Kartik; Lemley, Samuel V.; Schuldt, D. J.; Warren, Christopher N.; G'Sell, Max; Berg-Kirkpatrick, Taylor

arXiv:2405.00752 (cs)

[Submitted on 1 May 2024]

Authors: Nikolai Vogler, Kartik Goyal, Samuel V. Lemley, D.J. Schuldt, Christopher N. Warren, Max G'Sell, Taylor Berg-Kirkpatrick

Abstract: We propose a novel computational approach to automatically analyze the physical process behind the printing of early modern letterpress books by clustering the running titles found at the top of their pages. Specifically, we design and compare custom neural and feature-based kernels for computing the pairwise visual similarity of a scanned document's running titles, and we cluster the titles in order to track any deviations from the expected pattern of a book's printing. Unlike body text, which must be reset for every page, the running titles are one of the static type elements in a skeleton forme, i.e., the frame used to print each side of a sheet of paper, and were often reused during a book's printing. To evaluate the effectiveness of our approach, we manually annotate the running title clusters on about 1600 pages across 8 early modern books of varying sizes and formats. Our method can detect potential deviations from the expected patterns of such skeleton formes, which helps bibliographers understand the phenomena associated with a text's transmission, such as censorship. We also validate our results against a manual bibliographic analysis of a counterfeit early edition of Thomas Hobbes' Leviathan (1651).
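The pipeline the abstract describes (pairwise visual similarity, clustering, then checking for deviations from an expected reuse pattern) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the authors' implementation: cosine similarity on flattened pixel vectors stands in for the paper's neural and feature-based kernels, a thresholded single-linkage grouping stands in for its clustering step, and the names `cosine_similarity`, `cluster_titles`, `find_deviations`, the `threshold` value, and the fixed `period` of reuse are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two flattened pixel vectors (stand-in kernel)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_titles(vectors, threshold=0.9):
    """Group titles whose pairwise similarity is >= threshold.

    Uses union-find, so clusters are the connected components of the
    thresholded similarity graph (single linkage).
    """
    parent = list(range(len(vectors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(vectors)), 2):
        if cosine_similarity(vectors[i], vectors[j]) >= threshold:
            parent[find(j)] = find(i)

    # Relabel cluster roots as 0, 1, 2, ... in order of first appearance.
    labels, seen = [], {}
    for i in range(len(vectors)):
        root = find(i)
        labels.append(seen.setdefault(root, len(seen)))
    return labels


def find_deviations(labels, period):
    """Return page indices whose cluster differs from the page `period` earlier.

    Assumes (hypothetically) that titles are reused with a fixed period.
    """
    return [i for i in range(period, len(labels))
            if labels[i] != labels[i - period]]


# Toy "running title images": six pages, each a flattened 6-pixel binary crop.
toy_titles = [
    [1, 0, 1, 0, 1, 0],  # title setting A
    [0, 1, 0, 1, 0, 1],  # title setting B
    [1, 0, 1, 0, 1, 0],  # A reused
    [0, 1, 0, 1, 0, 1],  # B reused
    [1, 0, 1, 0, 1, 0],  # A reused
    [1, 1, 0, 0, 1, 1],  # a different setting where B was expected
]

labels = cluster_titles(toy_titles, threshold=0.9)
deviations = find_deviations(labels, period=2)
print(labels, deviations)
```

In this toy input the last page breaks the alternating pattern, so it is the only index reported by `find_deviations`. Real scans would need title detection, alignment, and a far more robust similarity measure than raw pixel cosine similarity.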

Comments:	Accepted at ICDAR 2024
Subjects:	Digital Libraries (cs.DL)
Cite as:	arXiv:2405.00752 [cs.DL]
	(or arXiv:2405.00752v1 [cs.DL] for this version)
	https://doi.org/10.48550/arXiv.2405.00752

Submission history

From: Nikolai Vogler
[v1] Wed, 1 May 2024 06:18:01 UTC (6,163 KB)
