The Lexical Resources Working Group received funding in 2017 for the project “Services development and scalability”. This project aimed to organize two workshops to fine-tune the proposal for TEI-Lex, a baseline TEI schema for encoding dictionaries.
With TEI-Lex, the Working Group aimed to address one of the main challenges facing the creation of interoperable lexical data: the lack of a common target transformation format for:
- comparing existing dictionaries
- serializing LMF
- creating generic querying and visualization tools
TEI-Lex is not imagined as a one-size-fits-all schema, but rather as a community-agreed solution for tightening representational constraints so that the lexical data contained in the digital editions of existing and future dictionaries can be studied, repurposed and improved upon in various scholarly contexts.
The Lexical Resources Working Group organized two advanced workshops in which existing Working Group members and external experts made a significant push toward the creation of TEI Lex-0, a baseline schema for encoding dictionaries. The first workshop, “TEI-Lex: From Best-Practice Guidelines to a TEI Schema”, hosted in Berlin on May 2–3, 2018, focused on the analysis of best practices in the TEI lexicographic community and the challenges of tightening the TEI framework.
The second workshop, “TEI-Lex and Beyond: Toward a Lexical Data Seal of Compliance”, took place at the Faculty of Arts in Ljubljana on July 16, 2018. This workshop was co-located with the EURALEX 2018 conference and discussed the role that TEI-Lex could play in the creation of a DARIAH-supported mechanism for recognizing high-quality lexical data.
Both workshops allowed participants to discuss a range of complex encoding issues, explore a variety of lexical resources and analyze numerous best-practice examples stemming from the TEI lexicographic community. The two workshops significantly contributed to the following outputs:
- the creation of a GitHub space for the WG
- TEI Lex-0 documentation
- TEI Lex-0 RelaxNG schema
- two conference papers presented at the TEI2018 Conference in Tokyo:
  - Toma Tasovac and Laurent Romary, “TEI Lex-0: A Target Format for TEI-Encoded Dictionaries and Lexical Resources”
  - Jack Bowers, Axel Herold, Laurent Romary, “TEI-Lex0 Etym: Towards Terse Recommendations for the Encoding of Etymological Information”
As a consequence of the important work that was partially funded by the DARIAH WG Funding Call, TEI Lex-0 is already being used across Europe:
- TEI Lex-0 has been officially adopted by the H2020-funded project ELEXIS (European Lexicographic Infrastructure) as a format to be used for collecting lexical data from project partners, observers and other data providers
- TEI Lex-0 has also been successfully presented and tested by participants in the Lexical Data Masterclass 2018, which was held at the Berlin-Brandenburg Academy of Sciences in 3–7
- The editor of the Portuguese Academy of Sciences Dictionary of the Portuguese Language, Ana Salgado, is working on making this major dictionary TEI Lex-0 native
The Working Group took important steps towards the completion of TEI Lex-0 by inviting top experts in lexical data modeling to participate in the workshops and by engaging with audiences new to DARIAH.
This post is part of the Working Groups Stories series presenting results and outcomes from the Working Group Funding Scheme 2017-2018.