natural_language_processing_service_online
Differences
This shows you the differences between two versions of the page.
natural_language_processing_service_online [2016/08/28 23:12] – created eplatte | natural_language_processing_service_online [2020/08/06 08:33] (current) – amirzeldes | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ===Natural Language Processing (NLP) Pipeline=== | + | ===== Natural Language Processing (NLP) Pipeline ===== |
+ | |||
+ | This page describes how to use the NLP tools on the public Coptic Scriptorium website. | ||
* Access the [[https:// | * Access the [[https:// | ||
- | * Copy the digitized text into NLP Service. | + | * Copy the digitized text into NLP Service |
- | * Be sure "My data contains meaningful linebreaks" | + | * Be sure "My data contains meaningful linebreaks" |
- | The NLP can either tokenize Coptic as part of the entire NLP SGML pipeline (select "SGML pipeline" | + | |
- | * Select “Just piped and dashed morphemes” and run the Service. | + | |
- | * Cut and paste the SGML output into a text file and proofread the automatic tokenization, | + | |
- | * Copy the proofread SGML back into the NLP Service input window. | + | |
- | * Select all annotations desired (usually all except “parse”), | + | |
- | * Copy and convert the SGML output into a multilayer spreadsheet format using the [[import_macro|project’s converter]]. | + | |
- | * Manually proofread and edit data in existing layers. | + | |
- | * Add any missing layers manually or using other existing tools. | + | |
- | * Check layer names to ensure they conform to project standards for the [[annotation_layer_names|data model]]. | + | |
+ | The NLP tools can either tokenize Coptic as part of the entire NLP SGML pipeline (select "SGML pipeline" | ||
+ | - To proofread tokens: Select “Just piped and dashed morphemes” and run the Service. | ||
+ | - Cut and paste the SGML output into a text file and proofread the automatic tokenization, | ||
+ | - Copy the proofread SGML back into the NLP Service input window. | ||
+ | - Select all annotations desired, and run the Service. | ||
+ | - Copy the SGML output for your desired use. | ||
+ | * Note: Coptic SCRIPTORIUM annotators can copy the output, paste it into a plain text file, save it, and then import that file into GitDox. The spreadsheet mode in GitDox has an option to import SGML. Most annotators will never need this feature, so do not be concerned if you are unfamiliar with it. |
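When proofreading the "Just piped and dashed morphemes" output, a quick scripted check can help spot obvious slips before you paste the text back into the Service. The sketch below is only an illustration and is not part of the NLP tools. It assumes that bound groups are separated by whitespace, that pipes (`|`) separate word units inside a bound group, and that dashes (`-`) separate morphemes inside a word unit. The function names and the placeholder input are hypothetical.

```python
def parse_bound_group(group: str) -> list[list[str]]:
    """Split one bound group into word units (on '|'), each split into morphemes (on '-')."""
    return [unit.split("-") for unit in group.split("|")]


def parse_line(line: str) -> list[list[list[str]]]:
    """Parse a whitespace-separated line of bound groups (assumed layout, see lead-in)."""
    return [parse_bound_group(group) for group in line.split()]


def suspect_groups(line: str) -> list[str]:
    """Return bound groups containing an empty unit or morpheme, e.g. 'a||b' or 'a-'."""
    return [
        group
        for group in line.split()
        if any(morph == "" for unit in parse_bound_group(group) for morph in unit)
    ]


if __name__ == "__main__":
    # Placeholder Latin letters stand in for Coptic text.
    sample = "a|b-c d e||f g-"
    print(parse_line(sample))
    print(suspect_groups(sample))
```

Groups reported by `suspect_groups` are only candidates for a second look. The actual tokenization decisions still need to be proofread by hand as described in the steps above.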
natural_language_processing_service_online.1472447546.txt.gz · Last modified: 2016/08/28 23:12 by eplatte