corpus_metadata

Differences

This shows you the differences between two versions of the page.

corpus_metadata [2017/06/08 15:44] – created eplatte
corpus_metadata [2017/06/08 21:55] (current) – Updates based on CTS and AZ feedback eplatte
Line 1: Line 1:
-Metadata at the corpus level includes the following fields.
+Metadata at the corpus level includes the following fields. See our [[annotation_layer_names| annotation layer name documentation]] for document-level metadata.
 
-|annotation| names of anyone involved in annotation of any kind (transcription, annotation, editing, etc.) for all texts in the corpus, separated by commas|
+|annotation| names of anyone involved in annotation of any kind (transcription, annotation, editing, etc.) for all texts in the corpus, separated by a comma and a space between the names|
 |corpus| the name of the corpus as it will appear on data.copticscriptorium.org or in ANNIS|
-|Greek_source| information about source for Greek aligned in Sahidica, currently only used in 1cor corpus
+|Greek_source| optional; information about source for Greek aligned text
-|copyright| copyright information, currently only used in machine-annotated Sahidica corpus|
+|copyright| optional; copyright information, currently only used in machine-annotated Sahidica corpus|
 |languages| language(s) of texts in the corpus, usually Sahidic Coptic|
-|license| optional; license under which the corpus is published (included when not CS's usual CC-BY, such as for the Sahidica)|
+|license| optional; license under which the corpus is published (included when not Coptic SCRIPTORIUM's usual CC-BY, such as for the Sahidica)|
 |project| name of project supporting the transcription/annotation/publication (e.g., Coptic SCRIPTORIUM, KoMET, etc.)|
-|source| optional; source of material for publication, if applicable, for instance, http://papyri.info for the doc.papyri corpus|
+|source| optional; source of material for publication, if applicable, for instance, http://papyri.info for the doc.papyri corpus|
-|tagger_version| version of tagger used on corpus, currently used only in machine-annotated corpora|
+|tagger_version| optional; version of tagger used on corpus, currently used only in machine-annotated corpora|
-|tokenizer_version| version of tokenizer used on corpus, currently used only in machine-annotated corpora|
+|tokenizer_version| optional; version of tokenizer used on corpus, currently used only in machine-annotated corpora|
 |translation| name of translators for all texts in the corpus, separated by commas|
 |version_date| the most recent date of publication for any text in the corpus in yyyy-mm-dd format|
 |version_n| updated for any publication of new data for the corpus|
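
The field list in the current revision can be checked mechanically. The following is a minimal Python sketch, not part of the project's tooling. The field names and the "optional" markings come from the table above. Treating every other field as required, the function names `split_names` and `check_corpus_metadata`, and the sample values are assumptions made for illustration.

```python
import re
from datetime import datetime

# Fields marked "optional" in the current revision of the table.
OPTIONAL_FIELDS = {
    "Greek_source", "copyright", "license", "source",
    "tagger_version", "tokenizer_version",
}
# Assumption of this sketch: every field not marked optional is required.
REQUIRED_FIELDS = {
    "annotation", "corpus", "languages", "project",
    "translation", "version_date", "version_n",
}
# Four digits, two digits, two digits, matching "yyyy-mm-dd".
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def split_names(value):
    """Split a name list written as 'A, B, C' (a comma and a space)."""
    return value.split(", ")


def check_corpus_metadata(meta):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for field in sorted(REQUIRED_FIELDS - set(meta)):
        problems.append(f"missing field: {field}")
    for field in sorted(set(meta) - REQUIRED_FIELDS - OPTIONAL_FIELDS):
        problems.append(f"unknown field: {field}")

    date = meta.get("version_date")
    if date is not None:
        try:
            if not DATE_PATTERN.fullmatch(date):
                raise ValueError("wrong shape")
            # Also reject impossible dates such as 2017-13-40.
            datetime.strptime(date, "%Y-%m-%d")
        except ValueError:
            problems.append(f"version_date is not a valid yyyy-mm-dd date: {date!r}")

    names = meta.get("annotation")
    if names is not None:
        for name in split_names(names):
            # A leftover comma or stray whitespace means the separator was not ", ".
            if not name or name != name.strip() or "," in name:
                problems.append(f"annotation names should be separated by ', ': {names!r}")
                break
    return problems


# Hypothetical sample values, for illustration only.
sample = {
    "annotation": "Annotator One, Annotator Two",
    "corpus": "example.corpus",
    "languages": "Sahidic Coptic",
    "project": "Coptic SCRIPTORIUM",
    "translation": "Translator One",
    "version_date": "2017-06-08",
    "version_n": "1.0.0",
}
print(check_corpus_metadata(sample))
```

Only the `annotation` field is checked for the comma-and-space separator, because that is the only field whose description specifies it; `translation` still says "separated by commas".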
corpus_metadata.1496958297.txt.gz · Last modified: 2017/06/08 15:44 by eplatte