Basic Annotation Workflow

Text file

Tokenizer

Import into spreadsheet

You now have an Excel file in which the tokenized morphemes are aligned with their bound groups and with the normalized morphemes. (If you are working with a Sahidica document, you may have translations and verses as well; with a diplomatic transcription, line breaks, column breaks, and other manuscript annotations are also aligned.)

Proofread the tokenization of the bound groups. Add or delete rows if necessary. You may wish to use Google Refine.
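
Part of this proofreading can be sketched as an automated check. The example below is a minimal illustration, not part of the documented workflow: it assumes the spreadsheet has been exported as rows of (bound group, morpheme) pairs, that the bound group cell is empty on continuation rows, and that the morphemes of a group should concatenate to the bound group. The function names and the transliterated sample data are hypothetical.

```python
def group_rows(rows):
    """Collect morphemes under the bound group they belong to.

    rows: list of (bound_group, morph) pairs in spreadsheet order.
    Assumption: bound_group is "" on rows that continue the previous group.
    """
    groups = []
    for row_number, (bound_group, morph) in enumerate(rows, start=1):
        if bound_group:
            # A non-empty bound group cell starts a new group.
            groups.append({"row": row_number, "group": bound_group, "morphs": [morph]})
        elif groups:
            # An empty cell continues the most recent group.
            groups[-1]["morphs"].append(morph)
        else:
            raise ValueError("first row has no bound group")
    return groups


def find_mismatches(rows):
    """Return the groups whose morphemes do not concatenate to the bound group."""
    return [g for g in group_rows(rows) if "".join(g["morphs"]) != g["group"]]


# Made-up, transliterated sample rows; the last group is missing a character.
sample_rows = [
    ("afsotm", "a"), ("", "f"), ("", "sotm"),
    ("mprome", "m"), ("", "p"), ("", "rome"),
    ("epei", "e"), ("", "p"), ("", "e"),
]

for g in find_mismatches(sample_rows):
    print(f"row {g['row']}: {g['group']} != {'+'.join(g['morphs'])}")
```

A check like this only flags rows for a human to inspect; whether a row should be added, deleted, or corrected remains an editorial decision.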

Normalization

Create a normalized bound group layer

Create a morph layer

Ensuring orig and norm layers are the same span

Part of speech tagging

Language of origin tagger

Metadata

basic_annotation_workflow.1441809804.txt.gz · Last modified: 2015/09/09 08:43 by ctschroeder