basic_annotation_workflow
Differences
This shows you the differences between two versions of the page.
basic_annotation_workflow [2016/08/28 23:26] – Updating NLP workflow eplatte | basic_annotation_workflow [2020/08/03 16:56] – admin | ||
---|---|---|---|
Line 1: | Line 1: | ||
==== Basic Annotation Workflow ==== | ==== Basic Annotation Workflow ==== | ||
- | [[transcribe_a_text|Text file]] | + | === Transcribe your text === |
+ | |||
+ | Transcribe your text in [[gitdox_workflow|GitDox]]. Alternatively, | ||
At this point, you may follow one of two paths: | At this point, you may follow one of two paths: | ||
Line 11: | Line 13: | ||
=== NLP Service Online Workflow === | === NLP Service Online Workflow === | ||
- | [[natural_language_processing_service_online|Run the NLP Service]] on your transcribed text. | + | [[natural_language_processing_service_online|Run the NLP Service]] on your transcribed text in GitDox. |
+ | * If your text is in a text file, copy and paste it into the GitDox text editor. (See the [[gitdox_workflow|GitDox]] page for more information on using the GitDox text editor.) | ||
+ | * If your text is already transcribed into the GitDox text editor and validated (see [[gitdox_workflow|GitDox]]), | ||
- | //You will generally want to proofread tokenization as part of the NLP Service process | + | You will see an NLP button below the text window. Click it. |
+ | |||
+ | //Note for veteran GitDox users: you do not need to proofread tokenization as part of the NLP Service process. The NLP Service now works better without tokenizing first.// | ||
[[import_macro|Import the SGML into a spreadsheet.]] | [[import_macro|Import the SGML into a spreadsheet.]] | ||
Rename the existing layers according to the [[annotation_layer_names|Annotation layer names guidelines]]. (Not all layers in the guidelines will exist in your file at this point.) | Rename the existing layers according to the [[annotation_layer_names|Annotation layer names guidelines]]. (Not all layers in the guidelines will exist in your file at this point.) | ||
+ | |||
+ | Remove any redundant columns. These may include hi (keep hi@rend), supplied (keep supplied@reason, etc.), and gap (keep gap@reason, etc.). | ||
+ | |||
+ | Add missing information to existing layers. For instance, replace the lb and cb placeholders in the lb@n and cb@n columns with the line and column numbers from the original manuscript. | ||
Note: the following steps are a guide to the kinds of work you will be doing. | Note: the following steps are a guide to the kinds of work you will be doing. | ||
Line 31: | Line 41: | ||
[[create_a_normalized_bound_group_layer|Reconstruct the norm_group layer]]. | [[create_a_normalized_bound_group_layer|Reconstruct the norm_group layer]]. | ||
- | Proofread the part of speech (pos), lemma (lemma), and morpheme (morph) layers. | + | Proofread the part of speech (pos), lemma (lemma), and morpheme (morph) layers. Part of speech and lemma are annotated on the norm level. |
Proofread the language of origin (lang) layer. | Proofread the language of origin (lang) layer. | ||
Line 37: | Line 47: | ||
* Coptic SCRIPTORIUM annotates for language of origin on the **morph** level, not the **word (norm)** level. | * Coptic SCRIPTORIUM annotates for language of origin on the **morph** level, not the **word (norm)** level. | ||
- | Add translation other layers as necessary following the [[annotation_layer_names|annotation layer names guidelines]]. | + | Add translation, paragraph, and other layers as necessary following the [[annotation_layer_names|annotation layer names guidelines]]. |
Add [[Metadata]]. | Add [[Metadata]]. |
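
The two spreadsheet cleanup steps in the newer revision (removing redundant columns, then filling the lb@n and cb@n placeholders) are normally done by hand in the spreadsheet. As a rough illustration only, the sketch below shows the same logic in Python. The function names, the list-based representation of a column, and the assumption that a placeholder cell holds the literal text `lb` or `cb` are all hypothetical; none of this is part of GitDox or the import macro.

```python
# Hypothetical sketch: element columns named in the workflow as possibly redundant.
POSSIBLY_REDUNDANT = ("hi", "supplied", "gap")


def drop_redundant_columns(columns):
    """Drop a bare element column (e.g. 'hi') only when an attribute
    column for the same element (e.g. 'hi@rend') is also present."""
    redundant = {
        element
        for element in POSSIBLY_REDUNDANT
        if any(name.startswith(element + "@") for name in columns)
    }
    return [name for name in columns if name not in redundant]


def fill_placeholders(cells, manuscript_numbers, placeholder="lb"):
    """Replace each placeholder cell, in order, with the next number
    read off the original manuscript. Other cells are left unchanged.

    Assumption: a placeholder cell contains exactly the text `placeholder`.
    """
    numbers = iter(manuscript_numbers)
    filled = []
    for cell in cells:
        if cell == placeholder:
            try:
                filled.append(str(next(numbers)))
            except StopIteration:
                raise ValueError("fewer manuscript numbers than placeholders")
        else:
            filled.append(cell)
    if next(numbers, None) is not None:
        raise ValueError("more manuscript numbers than placeholders")
    return filled


columns = ["tok", "hi", "hi@rend", "supplied", "supplied@reason", "gap", "lb@n"]
kept_columns = drop_redundant_columns(columns)

lb_n_column = ["lb", "", "", "lb", ""]
numbered_lines = fill_placeholders(lb_n_column, [1, 2])
```

In this example `gap` is kept because no `gap@...` column is present, so no information would be preserved if it were dropped. The line numbers are passed in explicitly rather than generated, since the workflow says they come from the original manuscript.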
basic_annotation_workflow.txt · Last modified: 2020/08/03 18:08 by admin