News from April 2024

—

EMLAP Corpus and Transcription Work

The transcription of the EMLAP corpus using TRANSKRIBUS, under close human supervision, is ongoing. The team is focusing on establishing conventions and standards to streamline and improve the transcription process, ensuring consistent and high-quality outputs.

Computational-Historical Advance

The Computational-Historical Group continues to make progress with token-based embeddings. More generally, the group’s work within the CCS-Lab aims to improve techniques for analysing linguistic data in historical and cultural contexts.
Vojtěch Kaše is now exploring methodological opportunities for transitioning between type-based embeddings (i.e. “lemmata embeddings”) and token-based embeddings (i.e. “embeddings of individual instances”). Together with Jana Švadlenková, he is developing an open, easily accessible and comparatively user-friendly tool for this purpose. The tool would allow users from the broader scholarly community (including non-digital and non-computational humanities researchers) to visualise the semantic neighbourhoods and relationships of lemmata from NOSCEMUS, as well as their diachronic development, tailored to their specific inputs and needs.
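
To make the distinction concrete: one common way to move from token-based to type-based embeddings is to average all token vectors that share a lemma. The sketch below illustrates that idea only. It is not the group's tool or method, and the function names, toy lemmata and two-dimensional vectors are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def type_embeddings(token_vectors):
    """Average token-level vectors per lemma into one type-level vector each.

    `token_vectors` is an iterable of (lemma, vector) pairs, where each
    vector represents one individual occurrence of the lemma.
    """
    grouped = defaultdict(list)
    for lemma, vector in token_vectors:
        grouped[lemma].append(vector)
    return {
        lemma: [sum(dim) / len(vectors) for dim in zip(*vectors)]
        for lemma, vectors in grouped.items()
    }


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


# Hypothetical toy data: two occurrences of "natura", one of "ars".
tokens = [
    ("natura", [1.0, 0.0]),
    ("natura", [0.0, 1.0]),
    ("ars", [1.0, 1.0]),
]

types = type_embeddings(tokens)
similarity = cosine(types["natura"], types["ars"])
```

Keeping the individual token vectors alongside the averaged ones is what would let a tool switch between the two views, for example by averaging only the occurrences from a given period to trace diachronic change.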

Invited talk at the Stavelot VERITRACE Workshop

Petr Pavlas delivered a talk at the Stavelot VERITRACE internal workshop in early March, invited by Cornelis J. Schilt (Vrije Universiteit Brussel). Petr’s presentation covered some theoretical and intellectual-historical aspects of TOME’s ongoing research, sparking engaging discussions. You can check the event’s programme and read a detailed report on the meeting.
