corpus_metadata

Metadata at the corpus level includes the following fields. See our annotation layer name documentation for document-level metadata.

annotation names of everyone involved in annotation of any kind (transcription, annotation, editing, etc.) for all texts in the corpus, separated by a comma and a space
corpus the name of the corpus as it will appear on data.copticscriptorium.org or in ANNIS
Greek_source optional; information about source for Greek aligned text
copyright optional; copyright information, currently only used in machine-annotated Sahidica corpus
languages language(s) of texts in the corpus, usually Sahidic Coptic
license optional; license under which the corpus is published (included when not Coptic SCRIPTORIUM's usual CC-BY, such as for the Sahidica)
project name of project supporting the transcription/annotation/publication (e.g., Coptic SCRIPTORIUM, KoMET, etc.)
source optional; source of material for publication, if applicable, for instance, http://papyri.info for the doc.papyri corpus
tagger_version optional; version of tagger used on corpus, currently used only in machine-annotated corpora
tokenizer_version optional; version of tokenizer used on corpus, currently used only in machine-annotated corpora
translation names of translators for all texts in the corpus, separated by commas
version_date the most recent date of publication for any text in the corpus in yyyy-mm-dd format
version_n version number; updated for any publication of new data for the corpus
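
The field list above can be checked mechanically before a corpus is published. The following is a minimal sketch in Python, not part of any Coptic SCRIPTORIUM tool: the function names check_corpus_metadata and split_names are hypothetical, the metadata is assumed to be available as a plain dictionary of strings, and treating every field not marked "optional" as required is an assumption drawn from the list, not a documented rule.

```python
import re

# Field names copied from the list above. Assumption: any field not
# marked "optional" there is treated as required.
REQUIRED = {
    "annotation", "corpus", "languages", "project",
    "translation", "version_date", "version_n",
}
OPTIONAL = {
    "Greek_source", "copyright", "license", "source",
    "tagger_version", "tokenizer_version",
}

# version_date is documented as yyyy-mm-dd; only the shape is checked here,
# not whether the value is a real calendar date.
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def check_corpus_metadata(meta):
    """Return a list of problems found in a corpus-level metadata dict."""
    problems = []
    present = set(meta)
    for name in sorted(REQUIRED - present):
        problems.append(f"missing field: {name}")
    for name in sorted(present - REQUIRED - OPTIONAL):
        problems.append(f"unknown field: {name}")
    date = meta.get("version_date")
    if date is not None and not DATE_RE.match(date):
        problems.append("version_date is not in yyyy-mm-dd format")
    return problems


def split_names(value):
    """Split an annotation value on the documented comma-and-space separator."""
    return [name for name in value.split(", ") if name]
```

Called on a dictionary that contains every non-optional field and a well-formed version_date, check_corpus_metadata returns an empty list; otherwise each problem is reported as one short message.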
corpus_metadata.txt · Last modified: 2017/06/08 21:55 by eplatte