Spring 2024 Digital Seeds Speaker Series presented by Falvey Library at Villanova University
Leveraging Large Language Models to Unveil Seventeenth-Century Books of Secrets
Please join us on Thursday, April 11, from 12-1 p.m. for a virtual talk by Sarah Lang, Digital Humanities researcher from the Center for Information Modeling, University of Graz in Austria, titled, “Leveraging Large Language Models to Unveil Seventeenth-Century Books of Secrets.”
This talk presents experiments with the semantic enrichment and computational analysis of seventeenth-century books of secrets, a genre that intricately intertwines recipe literature with practical how-to guides. These texts, sometimes characterized by their multi-column print layout, abundant use of alchemical symbols, and historical units of measurement, pose significant challenges for digital transcription and analysis. The current study aims to address these challenges by utilizing Large Language Models (LLMs) to enhance the accuracy and efficacy of a Transkribus transcription model, which, despite its sophistication, struggles with the specialized nature of these historical prints.
The experiment explores the potential of LLMs, including CustomGPTs and ChatGPT, in various subtasks such as layout detection, recipe segmentation, and transcription of alchemical symbols. Initial trials using ChatGPT for transcription have shown encouraging outcomes, suggesting a viable path for generating training data to quickly refine Transkribus models. Transkribus, while now outperformed by other software on certain OCR tasks, excels in handling historical special characters, a critical aspect for this genre.
The project involves developing an integrated workflow that leverages diverse LLMs for a comprehensive process encompassing layout detection, character transcription, and semantic tagging. Once semantic tagging is done, recipes and ingredients can be extracted and analyzed. A central feature of this workflow will be a human-in-the-loop approach for ensuring the accuracy and fidelity of the semantic enrichments to the original texts. This promises not only to enhance our understanding of seventeenth-century artisanal knowledge but also to contribute significantly to the field of digital humanities by demonstrating the potential of LLMs in historical text analysis and semantic enrichment using an understudied genre.
Sarah Lang has a Doctorate in Philosophy with a major in Digital Humanities and is currently a postdoctoral fellow at the Department “Centre for Information Modelling” at the University of Graz in Austria. After completing undergraduate and graduate degrees in History and Classics (Latin & Greek) in Graz (including an Erasmus stay in Montpellier), she transitioned into the field of Digital Humanities, and has been working on projects in this field since 2016. Lang’s PhD research, at the intersection of Digital Humanities and the early modern history of science, introduces computational methods into the history of alchemy. Her research focuses on decoding cryptographical stylistic devices specific to alchemy (Decknamen) by drawing on the case study of chymist Michael Maier’s (1568-1622) Neo-Latin corpus. Lang’s research was funded by the University of Graz bursary during her PhD (2018-2021), and her PhD thesis won the Bader Prize for the History of Science (Austrian Academy of Sciences, 2021).
Mapping the Margins: Gay Travel Guides & the Promise of Digital History
Please join us on Thursday, April 18, from 4-5 p.m. for a virtual talk by Drs. Amanda (Mandy) Regan and Eric Gonzaba titled, “Mapping the Margins: Gay Travel Guides & the Promise of Digital History.”
Professors Amanda Regan and Eric Gonzaba will discuss their NEH-funded project entitled Mapping the Gay Guides. The project utilizes the Damron Address Books, a longtime gay travel guide that began in the mid-1960s. First published in an era when most states banned same-sex intimacy both in public and private spaces, these travel guides helped gays find community spaces that catered to people like themselves. Much like the Green Books of the 1950s and 1960s, which African Americans used to find friendly businesses that would cater to black citizens in the era of Jim Crow apartheid, Damron’s guidebooks aided a generation of queer people in identifying sites of community, pleasure, and politics. Mapping the Gay Guides maps over 100,000 historical listings across all 50 states to understand changes in LGBTQ+ space and culture over half a century. Regan and Gonzaba will explain the importance of gay print culture beginning in the 1960s and the possibilities of understanding queer histories in a different light utilizing this kind of historical data.
Amanda (Mandy) Regan is an Assistant Professor in the Department of History and Geography at Clemson University. She is a historian of the late-19th and 20th centuries and specializes in women and gender as well as digital history. She received her PhD in 2019 from George Mason University where she was a Digital History Fellow at the Roy Rosenzweig Center for History and New Media (RRCHNM). From 2019-2021 she was a Postdoctoral Fellow at Southern Methodist University’s Center for Presidential History. At Clemson, she teaches in the department’s new Digital History Ph.D. program. Currently she is working on two projects. First, she is the co-director of Mapping the Gay Guides, an NEH-funded digital history project that draws on Bob Damron’s Address Books – a prolific set of travel guides for gay Americans in the last three decades of the 20th century. Second, she is revising a book manuscript entitled Shaping Up: Physical Fitness Initiatives for Women, 1880-1965, which is under contract with the University of Virginia Press.
Eric Gonzaba is an Assistant Professor in the Department of American Studies at California State University, Fullerton. He is a historian of race and sexuality in the United States, particularly focused on nightlife and LGBT cultures. He is the creator of Wearing Gay History, a digital archive and museum that explores global LGBTQ history through t-shirts. From 2021 until 2024, he served as co-chair of the Committee on LGBT History, the oldest LGBTQ historians’ association in the United States, and is the co-chair of the upcoming 2024 Queer History Conference. Gonzaba’s work has previously been supported by grants and fellowships from the University of Pennsylvania, Cornell, the Point Foundation, and the Elton John AIDS Foundation. Gonzaba received his PhD in 2019 from George Mason University, having defended his dissertation just a few days after Mandy.
These ACS-approved events, sponsored by Falvey Library, are free and open to all.
About the Digital Seeds Speaker Series:
The Digital Seeds Speaker Series is a Library-funded program that supports the invitation of guest speakers in the digital scholarship community to speak at Falvey Library about their research and/or give a workshop on a topic of their choice. The goal of the speaker series is to provide an opportunity for Villanova faculty, staff, and students to learn more about digital scholarship and research at the intersection of social science, humanities computing, and data science. The lectures are often held in the spring and are open to the public as well as to all Villanova faculty, staff, and students. The series is a great way to make connections, build community, and facilitate conversation.
Learn about past speakers here.
Digital Scholarship at Falvey Library:
Falvey Library’s Digital Scholarship Program supports faculty, students, and staff interested in applying digital methods and tools to their research and teaching. Digital scholarship encompasses a broad range of technologies and research areas, including but not limited to digital mapping (GIS), text and data mining, data visualization, virtual reality, 3D modeling, and digital publishing. We host lectures on digital scholarship topics, partner on digital research projects, and provide a collaborative space for consultations and training.
Learn more about Digital Scholarship here.