HUMA 5632 Digital Humanities Seminar - Open Workshop 2: Fine-tuning large-language models for humanities research

01:30pm - 02:50pm
Room 5566 (Lifts 27-28), Academic Building

Abstract

This workshop introduces the concept of fine-tuning large language models (LLMs) and explores their potential applications in the humanities. It begins by outlining the basic principles of fine-tuning and highlighting key technical considerations, such as dataset preparation, parameter selection, and evaluation strategies. To illustrate these ideas in practice, we present an ongoing project that fine-tunes a vision–language model for Manchu optical character recognition (OCR). This case study demonstrates how adapting LLMs to specialized historical sources can unlock new possibilities for text analysis, digitization, and multilingual research. By bridging technical workflows with humanistic inquiry, the workshop shows how fine-tuned models can empower scholars to engage with rare and complex sources in innovative ways.

Biography

Dr. Donghyeok Choi is a Postdoctoral Fellow in the Department of History at Hong Kong Baptist University. He received his Ph.D. in Culture Technology from KAIST with a dissertation on bureaucratic success and intergenerational mobility in the Joseon dynasty. His research lies at the intersection of digital humanities, quantitative history, and AI-driven analysis, with a particular focus on applying large language models to East Asian historical sources. His work spans studies of historical governance in Korea and the development of AI methods that support data-driven research in the humanities.

Recommended For
UG students
PG students
Faculty and staff
Organizer
Division of Humanities