The TOROT Treebank

The TOROT Treebank is a dependency treebank with morphosyntactic and information-structure annotation. It includes a large collection of texts in Old Church Slavonic, Old East Slavonic and Middle Russian and is freely available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License.

The treebank is an expansion of the Slavonic part of the PROIEL corpus and was started as part of the research project Birds and Beasts: Shaping Events in Old Russian, which was financed by the Norwegian Research Council. The treebank is still in active development.

The treebank is distributed as part of the official, versioned releases of the Syntacticus treebanks on GitHub, all of which belong to the PROIEL family. You can also try Syntacticus, an interface for browsing and searching the TOROT, PROIEL and related treebanks. Universal Dependencies conversions of a selection of the TOROT texts are also available here and here.

If you use the treebank, please cite as:

Hanne Martine Eckhoff and Aleksandrs Berdicevskis. 2015. 'Linguistics vs. digital editions: The Tromsø Old Russian and OCS Treebank'. Scripta & e-Scripta 14–15, pp. 9–25.

Releases are hosted on GitHub, where you will also find a complete overview of the texts in the latest release. Earlier releases (before 2023) can be found here.

The Slavonic implementation of the morphosyntactic annotation scheme is described in the TOROT Guidelines for Annotation, which are fully compatible with, and rely on, the PROIEL Guidelines for Annotation.