We are pleased to announce the ATRIUM Summer School on Automatic Text Recognition, taking place from September 1–5, 2025, at the DARIAH Coordination Office in Berlin. This intensive five-day training event is designed for researchers, digital humanists, and cultural heritage professionals who want to explore and implement state-of-the-art OCR (optical character recognition) and HTR (handwritten text recognition) technologies in their projects.
You have automatically extracted text from digitized material, but are not satisfied with the output? You want to improve the accuracy of your transcribed data? You know how to apply an existing model to your documents but want to go further and learn how to fine-tune a model or create one from scratch? You work in a project team that deals with complex data needing particular attention in the Automatic Text Recognition process in order to address a specific difficulty? You want to tackle advanced recognition issues? Then the ATRIUM Summer School on ATR is made for you.
About the Summer School
The ATRIUM Summer School will provide an in-depth approach to automatic text recognition with a focus on practical applications in concrete research scenarios. Participants will gain insights into the latest developments in OCR and HTR, with particular attention to open-source tools such as eScriptorium and to workflows that facilitate the digitization and analysis of historical and modern texts.
Over the course of the week, the trainer team will alternate between methodological input and supervised hands-on sessions in which participants improve their automatic text recognition pipelines. Input will cover not only pre-processing, segmentation, layout analysis, and post-processing, but also data management, empowering participants to achieve concrete goals in terms of the management, processing and reusability of their data during the summer school and beyond.
What to expect?
- Expert-led lectures and practical sessions on ATR tools and techniques
- Hands-on training with open-source platforms such as eScriptorium
- Case studies showcasing real-world applications in research and cultural heritage
- Opportunities for participants to work on their own projects and receive expert feedback
- Networking with peers and leading scholars in the field
Who should apply?
- Researchers in the humanities and social sciences
- Digital humanists and computational linguists
- Archivists, librarians, and cultural heritage professionals
- Developers and data scientists working with textual data
Basic familiarity with digital humanities and at least some experience with OCR/HTR is required, as we will quickly jump into practice.
We will consider applications from teams of up to two members working on the same project requiring an automatic text recognition pipeline, provided they have well-defined objectives for the summer school week, and a dataset to work on.
Technical Requirements
The ATRIUM Summer School will support you in processing and transcribing your documents. Therefore, it is imperative that you bring your own dataset, such as scanned pages of the documents you want to transcribe. Please ensure that the digitization is of good quality. The dataset should not contain images that are too noisy (avoid blurry images, stains, tears, etc.), as this would severely hinder the recognition process. Finally, although very high resolution is not necessary, a minimum resolution of 300 dpi is recommended to ensure reliable recognition by the software.
Participants are expected to bring their own laptop.
Application Procedure
To apply, please fill out the application form here by Tuesday, 1st April 2025.
If you have any questions, please contact info@atrium-research.eu.
Important Dates
- Applications open: Tuesday, 4th March 2025
- Deadline for application: Tuesday, 1st April 2025
- Communication of results: Early May 2025
- Summer School dates: Monday, 1st September – Friday, 5th September 2025
Funding Available
Participation in the ATRIUM Summer School is free of charge. All accepted participants will be entitled to a stipend of up to 1000 Euro for travel, accommodation and food in accordance with the DARIAH travel policy. Successful applicants will receive more information regarding the DARIAH travel policy upon acceptance of their application.
Organising Committee
- Anne Baillot (Instructor)
- Sarah Bénière (Instructor)
- Megan Black (ATRIUM Project Coordinator)
- Floriane Chiffoleau (Instructor)
- David Lassner (Instructor)
- Toma Tasovac (ATRIUM Principal Investigator)
What is ATRIUM?
ATRIUM (Advancing Frontier Research In the Arts and Humanities) is a Horizon Europe project which brings together 30 partners from 14 countries to improve access to digital research infrastructures and foster transdisciplinary knowledge across archaeology, arts, humanities and languages. ATRIUM’s goals are both technical and social: the four-year project will not only expand the availability of digital tools and methods but also provide training on how to combine them with datasets into meaningful and reproducible research processes.
What is DARIAH?
DARIAH (Digital Research Infrastructure for the Arts and Humanities) is a European research infrastructure with a mission to empower research communities with digital methods to create, connect and share knowledge about culture and society. It provides tools, services, and expertise to researchers, helping them manage, share, and analyze digital data in the humanities and cultural heritage fields. DARIAH is also the coordinator of the ATRIUM project.
