Machine Predictions and Synthetic Text: A Roundtable

By Eliza Papaki | News | October 18, 2021

Online event | October 26, 4:30-6:00 pm EDT / 10:30 pm-12:00 am CEST

This Roundtable invites two co-authors of the recently published paper “On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?” to speak with three leading digital humanities scholars about the implications of the article for humanities research employing NLP methods. Together, they will discuss how the authors’ attention to process (data gathering, documentation, standards) and ethics in AI can be turned to humanists creating data and models for the study of literature, history, and culture.

This paper was published in March 2021 and has, since then, sparked impassioned conversations on the unintended consequences and potential harms of prominent natural language processing (NLP) projects. While this groundbreaking paper has been influential in computer and data science—prompting reflection on the dangers of relying on poorly conceptualized and curated data—it is only beginning to be discussed by humanities scholars who use NLP methods in their research.

Panelists

Angelina McMillan-Major (University of Washington, Computational Linguistics)
Gimena del Rio Riande (University of Buenos Aires, Romance Philology)
Lauren Klein (Emory University, English and Quantitative Theory & Methods)
Margaret Mitchell (CEO & Research Scientist, Ethical AI LLC)
Ted Underwood (University of Illinois, Information Science)

Moderator

Toma Tasovac (DARIAH-EU)

This event is part of the ongoing workshop series The New Languages for NLP: Building Linguistic Diversity in the Digital Humanities, held at the Center for Digital Humanities at Princeton and funded by the National Endowment for the Humanities. It is co-sponsored by the Center for Statistics and Machine Learning at Princeton and DARIAH-EU.

The language teams of the NLP workshop series were selected from a field of over eighty-five applications and chosen for the potential impact of their projects on current speakers as well as on scholars studying historical languages. Since June 2021, the nineteen participants have been creating linguistic data and trained language models for the following world languages:

Classical Arabic (ٱلْعَرَبِيَّةُ ٱلْفُصْحَىٰ)
Classical Chinese (文言文, funded by the CDH)
Kanbun (漢文)
Kannada (ಕನ್ನಡ)
Ottoman Turkish (لسان عثمانى)
Quechua (Qheswa simi)
Dostoevsky’s Russian (funded by the Canadian Social Sciences and Humanities Research Council)
Tigrinya (ትግርኛ)
Yiddish (ייִדיש)
Yoruba (Èdè Yorùbá)

A trans-Atlantic partnership

DARIAH-EU signed a Cooperating Partnership agreement with the Center for Digital Humanities at Princeton University in early 2021. The designation of Cooperating Partner allows the CDH to build on its already impressive range of international activities and expand its relationship with DARIAH itself. Princeton University is DARIAH’s first Cooperating Partner from outside of Europe.
