corpus_metadata
Metadata at the corpus level includes the following fields. See our annotation layer name documentation for document-level metadata.
annotation | names of everyone involved in annotation of any kind (transcription, annotation, editing, etc.) for all texts in the corpus, with the names separated by a comma and a space |
corpus | the name of the corpus as it will appear on data.copticscriptorium.org or in ANNIS |
Greek_source | optional; information about source for Greek aligned text |
copyright | optional; copyright information, currently used only in the machine-annotated Sahidica corpus |
languages | language(s) of texts in the corpus, usually Sahidic Coptic |
license | optional; license under which the corpus is published (included when not Coptic SCRIPTORIUM's usual CC-BY, such as for the Sahidica) |
project | name of project supporting the transcription/annotation/publication (e.g., Coptic SCRIPTORIUM, KoMET, etc.) |
source | optional; source of material for publication, if applicable, for instance, http://papyri.info for the doc.papyri corpus |
tagger_version | optional; version of tagger used on corpus, currently used only in machine-annotated corpora |
tokenizer_version | optional; version of tokenizer used on corpus, currently used only in machine-annotated corpora |
translation | names of translators for all texts in the corpus, separated by commas |
version_date | the most recent date of publication for any text in the corpus in yyyy-mm-dd format |
version_n | version number of the corpus; updated for any publication of new data for the corpus |
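
As a rough illustration of how the fields above could be checked before publication, here is a minimal Python sketch. It is not part of any Coptic SCRIPTORIUM tooling. The function name `check_corpus_metadata`, the dict representation, and the sample values are hypothetical, and treating every field not marked "optional" as required is an assumption drawn only from the wording of the table.

```python
import re
from datetime import datetime

# Field names come from the table above. Assumption: any field whose
# description does not begin with "optional;" is treated as required.
REQUIRED = {"annotation", "corpus", "languages", "project",
            "translation", "version_date", "version_n"}
OPTIONAL = {"Greek_source", "copyright", "license", "source",
            "tagger_version", "tokenizer_version"}


def check_corpus_metadata(meta):
    """Return a list of problems found in a corpus-level metadata dict."""
    problems = []
    present = set(meta)
    for field in sorted(REQUIRED - present):
        problems.append(f"missing field: {field}")
    for field in sorted(present - REQUIRED - OPTIONAL):
        problems.append(f"unknown field: {field}")

    date = meta.get("version_date")
    if date is not None:
        # Require exactly yyyy-mm-dd, then confirm it is a real calendar date.
        ok = re.fullmatch(r"\d{4}-\d{2}-\d{2}", date) is not None
        if ok:
            try:
                datetime.strptime(date, "%Y-%m-%d")
            except ValueError:
                ok = False
        if not ok:
            problems.append(f"version_date is not in yyyy-mm-dd format: {date!r}")
    return problems


def split_names(value):
    """Split a comma-separated name list such as the annotation field."""
    return [name.strip() for name in value.split(",") if name.strip()]


# Hypothetical sample values, for illustration only.
example = {
    "annotation": "A. Example, B. Example",
    "corpus": "example.corpus",
    "languages": "Sahidic Coptic",
    "project": "Coptic SCRIPTORIUM",
    "translation": "C. Example",
    "version_date": "2017-06-08",
    "version_n": "1.0.0",
}
print(check_corpus_metadata(example))
print(split_names(example["annotation"]))
```

The check returns a list of messages instead of raising on the first error, so that all problems in a corpus can be reported in one pass.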
corpus_metadata.txt · Last modified: 2017/06/08 21:55 by eplatte