New corpora – release 2.4.0 is out!

We are pleased to announce release 2.4.0, which adds new corpora. The tagged and lemmatized corpora are available for reading and download at [1] and are fully searchable at [2]:

[1] http://data.copticscriptorium.org/

[2] https://corpling.uis.georgetown.edu/annis/scriptorium

This release contains new data contributed by Alin Suciu, David Brakke and Diliana Atanassova, as well as out-of-copyright edition material contributed by the Marcion project. New data in this release includes excerpts from:

  • The Martyrdom of Saint Victor the General (2033 tokens)
  • The Canons of Apa Johannes (438 tokens)
  • Pseudo-Theophilus On the Cross and The Thief (2814 tokens)
  • Shenoute, Some Kinds of People Sift Dirt (888 tokens)
  • 11 additional Apophthegmata Patrum, bringing the total released to 63 apophthegms (7077 tokens)

All texts are also linked to the Coptic Dictionary Online (https://corpling.uis.georgetown.edu/coptic-dictionary/), whose frequency information has been updated to include these texts. We would like to thank the annotators and translators of these data sets, several of whom are new to the project, without whose work the corpora would not be online:

Alexander Turtureanu, Alin Suciu, Amir Zeldes, Caroline T. Schroeder, Christine Luckritz Marquis, Dana Robinson, David Brakke, David Sriboonreuang, Diliana Atanassova, Elizabeth Davidson, Elizabeth Platte, Gianna Zipp, J. Gregory Given, Janet Timbie, Jennifer Quigley, Laura Slaughter, Lauren McDermott, Marina Ghaly, Mitchell Abrams, Paul Lufter, Rebecca Krawiec, Saskia Franck and Tobias Paul

We hope everyone will find this release useful and look forward to releasing more data in the coming year!

 
