News from January 2024

—

The first published result of the TOME project, although achieved without distant reading or computational analysis, was written by Petr Pavlas and appeared in Comenius-Jahrbuch 31. The publication can be read and downloaded as a PDF [here].
Vojtěch Kaše and his group developed four word-embedding models for the NOSCEMUS corpus, one for each of the periods 1501–1550, 1551–1600, 1601–1650, and 1651–1700. The models were trained following the methodology of Sprugnoli et al. (2020), with the most frequent NOSCEMUS words added to the vocabulary. [The models can be found here]. They reveal shifts in the position of individual words within the semantic space (in a Nominalist sense) of scientific discourse: certain terms move between semantic clusters, and the relationships between words and word clusters change over time.
Vojtěch Kaše also identified and downloaded the entire Corpus Corporum database, which currently contains 7,819 Latin texts from various periods, totaling approximately 500 million words.
Jo Hedesan and her team continue to make progress on a digital textual corpus of Early Modern Latin Alchemical Prints. So far, they have identified and cleaned dozens of works, laying a foundation for further research.
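
As a rough illustration of how shifts between period-specific word-embedding models such as those mentioned above could be inspected, the sketch below compares a word's nearest neighbours in two periods and measures how much the neighbour sets overlap. It is not the TOME team's actual workflow. The vectors, the Latin words chosen, and the helper names `nearest_neighbours` and `neighbour_overlap` are all assumptions invented for this example; real vectors would have to be loaded from the published models.

```python
import math

# Toy 3-dimensional vectors, invented purely for illustration.
# They are NOT taken from the NOSCEMUS models and imply no real finding.
PERIOD_VECTORS = {
    "1501-1550": {
        "spiritus": [0.9, 0.1, 0.0],
        "anima": [0.8, 0.2, 0.1],
        "deus": [0.7, 0.3, 0.0],
        "aer": [0.1, 0.9, 0.2],
        "vapor": [0.0, 0.8, 0.3],
    },
    "1601-1650": {
        "spiritus": [0.2, 0.9, 0.1],
        "anima": [0.8, 0.2, 0.1],
        "deus": [0.7, 0.3, 0.0],
        "aer": [0.1, 0.9, 0.2],
        "vapor": [0.0, 0.8, 0.3],
    },
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbours(vectors, word, k=2):
    """Return the k words most similar to `word` within one period's vectors."""
    target = vectors[word]
    scored = [
        (cosine(target, vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:k]]


def neighbour_overlap(vectors_a, vectors_b, word, k=2):
    """Jaccard overlap of a word's neighbour sets in two periods.

    1.0 means identical neighbours; 0.0 means no shared neighbours.
    """
    a = set(nearest_neighbours(vectors_a, word, k))
    b = set(nearest_neighbours(vectors_b, word, k))
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    early = PERIOD_VECTORS["1501-1550"]
    late = PERIOD_VECTORS["1601-1650"]
    for word in ("spiritus", "anima"):
        print(word, nearest_neighbours(early, word), nearest_neighbours(late, word),
              neighbour_overlap(early, late, word))
```

Comparing neighbour sets rather than raw coordinates is a deliberate choice in this sketch: separately trained models do not share a coordinate system, so their vectors cannot be compared directly without an alignment step, whereas neighbour lists can.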
