annotating_sub-word_morphemes
Differences
This shows you the differences between two versions of the page.
annotating_sub-word_morphemes [2015/09/09 07:16] – ctschroeder | annotating_sub-word_morphemes [2015/10/14 12:11] (current) – admin | ||
---|---|---|---|
Line 1: | Line 1: | ||
=== Create a morph layer === | === Create a morph layer === | ||
+ | |||
+ | There are multiple ways of creating a morph layer. | ||
+ | |||
+ | == Follow these steps if you have no morph layer at all == | ||
+ | |||
+ | Duplicate the norm layer. Name the new layer " | ||
+ | |||
+ | Manually, or [[google_refine|using Google Refine]], identify the normalized words that need to be annotated on the morph level. | ||
+ | |||
+ | Split the words you have identified into the requisite number of tokens. | ||
+ | * Be sure to break up the words in the **tok** and the **ignore: | ||
+ | * Ensure the norm, orig, norm_group, orig_group, pos, lemma, hi@rend, and other annotation layers stay aligned with the correct tokens. | ||
+ | * Do not break up the words in the norm or orig layers; only the tok and ignore: | ||
+ | |||
+ | Consider using [[google_refine|Google Refine]] to check your work (see if compound words or words containing mnt-, at-, ref-, etc., are still in the ignore: | ||
+ | |||
+ | Complete the steps in the next section to create the morph layer. | ||
+ | |||
+ | == Follow these steps if/when your file has an ignore: | ||
Many of these steps are demonstrated in this [[https:// | Many of these steps are demonstrated in this [[https:// | ||
- | You need to create a clean morph layer that has only unique data in it; 80-90% of the data is identical to the data in norm, making it difficult for a human to see when you’ve got compound words or morphs. | + | You need to create a clean morph layer that has only unique data in it; 80-90% of the data in ignore: |
1. Insert a new, empty column for the morph layer (as in the video) | 1. Insert a new, empty column for the morph layer (as in the video) | ||
Line 12: | Line 31: | ||
3. Select the cell with your formula in it and select the rest of the column down to the end of the layer data. Use the “Edit> | 3. Select the cell with your formula in it and select the rest of the column down to the end of the layer data. Use the “Edit> | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
| | ||
- | Additional steps are demonstrated in the [[https:// |
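The goal stated above, a morph layer that holds only the data not already present in norm, can be sketched in Python. This is a minimal illustration, not the spreadsheet formula used in the video (that formula is not shown on this page). It assumes each row carries the norm value of the word it belongs to, plus a value from the existing source column; the function name `clean_morph_column`, the parameter names, and the sample strings are all hypothetical.

```python
def clean_morph_column(norm_values, source_values):
    """Return a morph column that is blank wherever the source value
    simply repeats the norm value, and keeps the source value otherwise.

    Assumption: both lists have one entry per token row, and a word that
    was split into several tokens repeats its norm value on each row.
    """
    if len(norm_values) != len(source_values):
        raise ValueError("columns must have the same number of rows")
    return [
        source if source != norm else ""
        for norm, source in zip(norm_values, source_values)
    ]


# Made-up sample rows: the second word was split into two tokens.
norm = ["auw", "mntstem", "mntstem", "pe"]
source = ["auw", "mnt", "stem", "pe"]

morph = clean_morph_column(norm, source)
print(morph)  # ['', 'mnt', 'stem', '']
```

Only the rows for the split word keep a value, so compound words and morphs stand out when a human scans the column.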
annotating_sub-word_morphemes.1441804566.txt.gz · Last modified: 2015/09/09 07:16 by ctschroeder