Recent Posts

Coptic Dictionary and ANNIS database down

We are sorry to report that the server that hosts the Coptic Dictionary Online and Coptic Scriptorium’s ANNIS database is down. (Likewise, some of the NLP tools and internal tools like GitDox are down.) We are working on fixing the problem, but for now we do not have a timeline for when they will be […]

New Corpora Release 4.3.0

[Image: The opening lines of Pistis Sophia]

It is our pleasure to announce release 4.3.0 of Coptic Scriptorium corpora, which currently cover over 1,175,000 tokens of searchable, linguistically analyzed Coptic data from dozens of ancient Coptic works. New in this release: The History of Eustathius and Theopiste (hagiography, annotations by Lance Martin); Pistis Sophia, book 1 […]

Example of research using the online Coptic Dictionary: standalone G Thomas transcription

Martijn Linssen, an independent researcher, has been working on the Gospel of Thomas for some time and recently published a stand-alone “interactive Coptic-English translation” of the Gospel of Thomas on his Academia.edu site. The Coptic is linked to entries in the online Coptic Dictionary! We invite you to check it out! We are always excited […]

New Corpora Release 4.2.0

It is our pleasure to announce the latest data release from Coptic Scriptorium, version 4.2.0. This release contains both new Coptic material and additions to older datasets, and it expands our entity annotations and named-entity linking to all of our data, including the semi-automatically annotated Old Testament. This also means automatic updates to all of our interfaces, […]

Winter 2021 Corpora Release 4.1.0

We are pleased to announce the latest release of data from Coptic Scriptorium, version 4.1.0. The new release adds new Coptic texts and annotation additions, underscored by the application of named and non-named entity annotation to our New Testament corpus. In total, we released approximately 40,000 tokens of manually edited text in 17 documents from […]

Comprehensive Coptic Lexicon v1.2

The “Thesaurus Linguae Aegyptiae” project (“Strukturen und Transformationen des Wortschatzes der ägyptischen Sprache”, BBAW), the “Database and Dictionary of Greek Loanwords in Coptic” (DDGLC, Freie Universität Berlin), and “Coptic Scriptorium: Digital Research in Coptic Language and Literature” are pleased to announce the latest release of the “Comprehensive Coptic Lexicon”: Version 1.2. The raw data can […]

Summer 2020 Corpora Release 4.0.0

[Image: Place name index on data.copticscriptorium.org]

It is our great pleasure to announce the latest release of data from Coptic Scriptorium, version 4.0.0. This release contains both new Coptic material and extensive additions to our suite of tools and annotations, focusing on the addition of support for entity annotation and named-entity linking across our new and […]

Digital Coptic 3 – program online!

The program for the third edition of Digital Coptic is now online. Check out the workshop website for the list of projects, talks and presenters. Please join us for the workshop on July 12 and 13 – participants will receive a Zoom link and password for interactive presentations and discussion, and the workshop will also […]

A bird’s eye view of Coptic entities

Coptic Scriptorium recently annotated its Treebank for entities and will soon use automated tools to annotate all corpora. Entity recognition provides a window into what a text discusses, allowing readers to discover information about people and places of interest found throughout a large number of texts that they could not possibly read exhaustively. The Coptic […]

Entities in the Coptic Treebank

With the release of Version 2.6 of Universal Dependencies, our focus has shifted to handling Named and Non-Named Entity Recognition (NER/NNER) in Coptic data. As a result of intensive work by the Coptic Scriptorium team in the past few months, the development branch of the Treebank now contains complete entity spans and types for the entire data in […]

Universal Dependencies 2.6 released!

Check out the new Universal Dependencies (UD) release V2.6! This is the twelfth release of the annotated treebanks at http://universaldependencies.org/. The project now covers syntactically annotated corpora in 92 languages, including Coptic. The size of the Coptic Treebank is now around 43,000 words, and growing. For the latest version of the Coptic data, see our development branch here: https://github.com/UniversalDependencies/UD_Coptic-Scriptorium/tree/dev. […]