Differences

This shows you the differences between two versions of the page.

--- annotation_layer_names [2020/08/06 08:38] – amirzeldes
+++ annotation_layer_names [2022/01/12 09:47] (current) – admin
@@ Line 16: / Line 16: @@
 |lemma | lemma (dictionary head word); annotates on the normalized words ("norm" layer) |
 |note | notes that normally would go in a TEI XML <note note="xxx"> tag |
-|hi@rend | text renderings (see http://www.copticscriptorium.org/download/tools/SCRIPTORIUMDiplTranscriptionGuidelines.pdf sections 4.2 & 5). Use spaces not commas between elements (e.g., red large not red, large); to render spaces, be sure to place underscores in the phrase (e.g., 1_space_right not 1 space right); validates to TEI XML only if there are five or fewer elements |
+|hi@rend | usually appears as hi_rend in the column name in spreadsheet mode; for text renderings (see http://www.copticscriptorium.org/download/tools/SCRIPTORIUMDiplTranscriptionGuidelines.pdf sections 4.2 & 5). Use spaces not commas between elements (e.g., red large not red, large); to render spaces, be sure to place underscores in the phrase (e.g., 1_space_right not 1 space right); validates to TEI XML only if there are five or fewer elements |
 |gap | Annotates for lacunae. Corresponds to the EpiDoc TEI-XML element gap. Uses attributes such as @reason, @unit, @quantity, and @extent. With attributes, each element+attribute annotation generates a new layer in the multi-layer data model |
 |supplied | Annotates for supplied text where text is missing from the original for a variety of reasons. Corresponds to the EpiDoc TEI-XML element supplied. Uses attributes such as @evidence and @reason. With attributes, each element+attribute annotation generates a new layer in the multi-layer data model. |
-|lb@n | line breaks -- numbered according to the original manuscript |
+|lb@n | usually appears as lb_n in column header in spreadsheet mode; line breaks -- numbered according to the original manuscript |
-|cb@n | column breaks -- numbered according to the original manuscript |
+|cb@n | usually appears as cb_n in column header in spreadsheet mode; column breaks -- numbered according to the original manuscript |
 |pb_xml_id | Page numbers of original manuscript (not the current repository numbering); be sure column label does not include a colon (e.g. pb_xml_id not pb_xml:id); be sure page numbers do not include spaces (e.g. EG202 not EG 202) (TEI XML <pb xml:id="xxx">) |
 |ignore:note | notes that will NOT be imported into ANNIS or exported as TEI or PAULA XML; private notations from annotators/encoders/editors |
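The formatting rules in the rows above (spaces rather than commas in hi@rend, at most five elements for TEI validation, no colon in the pb_xml_id column label, no spaces in page numbers) can be checked mechanically. The following is a minimal sketch only: the function names are hypothetical and are not part of GitDox or any Coptic SCRIPTORIUM tool, and the `@` to `_` renaming simply mirrors the "usually appears as" wording above.

```python
def check_hi_rend(value):
    """Return a list of problems found in one hi@rend cell value."""
    problems = []
    if "," in value:
        problems.append("use spaces, not commas, between elements")
    # Count elements as if the commas had been written as spaces.
    elements = value.replace(",", " ").split()
    if len(elements) > 5:
        problems.append("more than five elements; will not validate to TEI XML")
    return problems


def check_pb_xml_id(column_label, page):
    """Return a list of problems for a page-break column label and page value."""
    problems = []
    if ":" in column_label:
        problems.append("column label must not include a colon")
    if any(ch.isspace() for ch in page):
        problems.append("page number must not include spaces")
    return problems


def spreadsheet_column_name(layer):
    """Map a layer name such as lb@n to its usual spreadsheet header, lb_n."""
    return layer.replace("@", "_")
```

For example, `check_hi_rend("red large")` reports no problems, while `check_hi_rend("red, large")` flags the comma, and `check_pb_xml_id("pb_xml:id", "EG 202")` flags both the colon and the space.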
@@ Line 40: / Line 40: @@
 |entity | one of the ten entity types (e.g. person, place) see [[https://github.com/CopticScriptorium/entity-tagging/raw/master/coptic_scriptorium_entity_guidelines.pdf|entity guidelines]]. Note that there can be multiple entity columns due to nested entities. These columns are typically not edited manually in the spreadsheet, but are added by the graphical entity editing interface in entities mode.|
 |identity | this annotation stores linked entry identifiers for named entities; it is populated automatically during export by GitDox if named entities have been added using the entity annotation interface. Annotators do not need to manually add this column|
+|arabic | Arabic translation. Spans should follow translation and verse layers |
+=== NOT columns in the spreadsheet ===
+The following information should **NOT** be annotated manually in the spreadsheet, as it is added by other processes:
+  * identity - this is the ANNIS annotation corresponding to named entity linking (Wikification). This information comes from the entity identification annotations in entities mode (after clicking "List named entities"). It should not be entered manually into the spreadsheet itself.
+  * func / head - this is syntactic information from automatic or gold parsing. It is never done in spreadsheet mode, but added during publication by an automatic parser, or annotated manually in the Arborator interface (but NOT in GitDox)
+  * multiword - multiword expression annotation is also added automatically during publication based on the current state of multiword entries in the Coptic Dictionary Online. It is not edited manually and should not be included in the spreadsheet.
 === Metadata ===
@@ Line 46: / Line 54: @@
 (Note: see also the [[corpus_metadata|corpus-level metadata]] documentation for adding metadata for the entire corpus.)
-[[https://github.com/CopticScriptorium/misc-development/raw/master/metadata-checklist-Dec2019.pdf|Download a checklist of the following fields]]
+[[https://github.com/CopticScriptorium/misc-development/raw/master/metadata-checklist-Dec2019.pdf|Download a checklist of the following fields]] (does not yet include arabic_translation and possibly other newer fields; the list below is the most accurate)
 |annotation | names of annotators (transcribers, editors, annotators) in comma delimited sequence |
+|arabic_translation | names of people who translated the text into Arabic in comma-delimited sequence |
 |attributed_author | optional. attributed author of a conceptual work who may or may not be the historical author |
 |author | author of the conceptual work |
@@ Line 60: / Line 69: @@
 |entities| describes whether entity annotation has been reviewed. Available values are automatic, checked, or gold; required|
 |Greek_source | optional, information about the Greek version of the text if it exists (e.g., Greek Alphabetical or Systematic Apophthegmata Patrum) |
-|idetities| describes whether named entity linking has been reviewed. Available values are automatic, checked, or gold; required|
+|identities| describes whether named entity linking has been reviewed. Available values are automatic, checked, or gold; required|
 |idno |catalogue # of the manuscript in the current repository|
 |language | language in which the text is written |
@@ Line 98: / Line 107: @@
 |version@n |version of this Coptic SCRIPTORIUM data|
 |witness        | prose note about parallels; optional|
+=== Automatic metadata ===
+GitDox will automatically generate semicolon-separated lists of named entities in the following metadata fields during export. They will not show up in the GitDox table, and you should not add or edit these manually:
+|people | named people identifiers for people mentioned in the document (separated by "; ") |
+|places | named place identifiers for places mentioned in the document (separated by "; ") |
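Downstream scripts that read the exported people and places fields need to split them on the "; " separator. A minimal sketch, assuming the field value arrives as a plain string; the function name and the sample identifiers are hypothetical and only illustrate the format described above:

```python
def split_entity_list(field_value):
    """Split a '; '-separated metadata value into a list of identifiers."""
    # Splitting on ';' and stripping tolerates missing or extra spaces.
    return [item.strip() for item in field_value.split(";") if item.strip()]


# Hypothetical exported value for the people field.
people = split_entity_list("Shenoute; Besa")
```

An empty field yields an empty list, so callers do not need a special case for documents without named entities.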