TOME project 3

News from April 2024

  • Human-supervised transcription of the TOME corpus in TRANSKRIBUS continues, and the team is establishing conventions and standards for the transcription practice itself.
  • The computational group is making progress on token-based embeddings and is currently testing the approach on a Czech corpus of pre-1989 samizdat journals.
  • Vojtěch Kaše is working on a way to move from type-based embeddings (embeddings of lemmata) to token-based embeddings (embeddings of individual instances), and back again. He is collaborating with Jana Švadlenková on a publicly accessible tool that visualises, in the form of plots, the semantic neighbourhood and proximity of lemmata from NOSCEMUS and their diachronic development, according to the specific needs and inputs of the user.
  • Petr Pavlas gave a talk at the VERITRACE internal workshop in Stavelot at the beginning of March (at the invitation of Cornelis J. Schilt). Here you can read a report from the meeting.

