This week, a research collaboration I’m in released an initial version of a new online Coptic dictionary. I blogged about it for the project. Even though it’s very preliminary, we’re pretty excited. I want to take a minute here on my personal blog to reflect on the process.
Working collaboratively is hard, working collaboratively across multiple institutions is harder, working collaboratively internationally is even harder, and working collaboratively internationally and in public is even harder than that. This is true even when everyone comes to the table with shared objectives and good will. It’s just hard! I think it’s worth it, though.
For this project, the dictionary incorporates an XML file created by Frank Feder, when he was at the Berlin-Brandenburg Academy (BBAW), and colleagues (all of whom are listed on the dictionary’s About page). This work took years, with Frank and colleagues providing translations for each entry in English, French, and German (including page references to the main English, French, and German print dictionaries).
This spring and summer, Amir Zeldes of Georgetown University and his student Emma Manning created a web interface to query the XML file.
One of our discussions has been about the fact that this XML lexicon file may not be complete. Should it still go online? The BBAW wishes eventually to incorporate Coptic lexical information into their Thesaurus Linguae Aegyptiae. This is a 20-year project.
Should the BBAW wait until the project concludes to release the lexicon? Or could this initial file be useful to people before that point? The BBAW and those of us in the other partner projects clearly agreed: yes, this material could be useful. It may not be complete, it may not be perfect, but it is useful. Moreover, implementing it may be helpful for future work and development. Seeing how people use an initial implementation can inform later developments.
Part of the goal of releasing the dictionary is also simply to explore what an online dictionary can do. We know that the structure of the data itself (the entries) or the interface for querying the data may not be perfect. Nonetheless, other than the Marcion project, there’s no place where one can read a text in Coptic and link to a lexicon, or search a Coptic lexicon and then immediately search a digital corpus for that lemma. (And the Marcion project is great, but for sustainability and archiving, an academic project hosted by an institution may survive longer than an individual’s project. Such stability is key for linking different projects.)
Implementing it also shows us the different data models people might use for digital Coptic. We have public lemmatization guidelines, but not everyone uses them. So when you click on a word in our corpora, you might get the response that there are zero entries in the online dictionary. It might be because there are zero entries; it might be because the lemmatization standards differ, and we need to map our lemmas onto the data model used by the BBAW for their lemmas.
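For readers who think in code, here is a minimal sketch of that lookup problem. It is not our actual implementation (the dictionary’s source code isn’t out yet); the names `LEXICON`, `CORPUS_TO_LEXICON`, and `lookup` are hypothetical, and the entries are toy data I made up for illustration, including the mismatch between the two lemma forms.

```python
from typing import Dict, List, Tuple

# Toy dictionary entries keyed by the dictionary's own lemma form.
# Illustrative data only; not taken from the BBAW lexicon file.
LEXICON: Dict[str, List[str]] = {
    "ⲥⲱⲧⲙ": ["hear, listen"],
    "ⲉⲓⲣⲉ": ["do, make"],
}

# Hypothetical mapping from a corpus lemma form to the dictionary's lemma
# form, for cases where the two lemmatization standards differ.
CORPUS_TO_LEXICON: Dict[str, str] = {
    "ⲣ": "ⲉⲓⲣⲉ",
}


def lookup(corpus_lemma: str) -> Tuple[List[str], str]:
    """Return (entries, how_found) for a lemma clicked on in a corpus.

    how_found is "direct" when the corpus lemma matches a dictionary lemma,
    "mapped" when a match is found only through the mapping table, and
    "none" when no entry is found either way.
    """
    if corpus_lemma in LEXICON:
        return LEXICON[corpus_lemma], "direct"
    mapped = CORPUS_TO_LEXICON.get(corpus_lemma)
    if mapped is not None and mapped in LEXICON:
        return LEXICON[mapped], "mapped"
    return [], "none"


if __name__ == "__main__":
    for lemma in ["ⲥⲱⲧⲙ", "ⲣ", "ⲛⲟⲩⲧⲉ"]:
        entries, how = lookup(lemma)
        print(lemma, how, entries)
```

The point of the sketch is that a result of "none" is ambiguous on its own: either the dictionary really has no entry, or nobody has yet written the mapping between the two data models.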
In other words, this dictionary is in part a chance to play with the possibilities, even if it is by no means perfect.
If you have suggestions, please take us seriously when we give our contact information for feedback, contributions, and issues. Get in touch! Also know that this is experimental and in process. The source code hasn’t been released yet, but it will be.
I want to end with the issue of credit for labor. As many people besides me can attest, giving proper credit for digital labor is an important and potentially sticky issue. Providing detailed credits is important. But questions about which server hosts a joint project (and which URL or domain is used) might still remain. Our dictionary is currently on a Georgetown web server, but the project includes labor from people at the BBAW and Göttingen, as well. And we anticipate that it will (in this form or another form) be hosted at the BBAW and elsewhere. It’s been important for us to document the labor and also acknowledge it each time we publicly reference the project.
People often ask me what this kind of collaborative digital work is like. I hope this exploration of decisions and processes has been helpful for understanding this particular outcome as well as this kind of work more generally.