part_of_speech_tagging_using_tree-tagger
Differences
This shows you the differences between two versions of the page.
part_of_speech_tagging_using_tree-tagger [2015/09/09 06:16] – created ctschroeder | part_of_speech_tagging_using_tree-tagger [2015/09/09 06:30] (current) – ctschroeder | ||
---|---|---|---|
Line 1: | Line 1: | ||
=== Part of speech tagging === | === Part of speech tagging === | ||
- | The norm layer will be tagged with the tree-tagger pos-tagger. | + | The norm layer will be tagged with the tree-tagger pos-tagger. |
- | The ignore: | + | To tag for part of speech, you will need the [[ http:// |
- | |||
- | |||
- | *tree-tagger pos-tagger | ||
Select the norm column, copy it, paste it into a text file. Make sure the text file is using Unix returns and is in UTF-8. | Select the norm column, copy it, paste it into a text file. Make sure the text file is using Unix returns and is in UTF-8. | ||
Save the new text file in whichever directory has the tree-tagger script (for Macs, the tree-tagger-MacOSX-3.2-intel/ | Save the new text file in whichever directory has the tree-tagger script (for Macs, the tree-tagger-MacOSX-3.2-intel/ | ||
Open a terminal window at that directory. | Open a terminal window at that directory. | ||
- | Run the tree-tagger. | + | Run the tree-tagger. |
Open the outputFileName. | Open the outputFileName. | ||
Copy and paste the data into empty columns in the Excel file. | Copy and paste the data into empty columns in the Excel file. | ||
- | If you have morphs, the tree-tagger will not have respected the spans in the norm layer. | + | |
+ | If you have an additional morph annotation layer, the tree-tagger will not have respected the resulting | ||
*select the ORIGINAL norm column (not the one you just pasted in; to be safe, you might rename the new one ignore:norm or something like that). | *select the ORIGINAL norm column (not the one you just pasted in; to be safe, you might rename the new one ignore:norm or something like that). | ||
- | *Click the “unmerge cells” button to unmerge the spans.” | + | *Click the “unmerge cells” button to unmerge the spans. |
*Using the Find function, find the next empty cell. (If the norm layer is selected, it will only find empty cells in that column.) | *Using the Find function, find the next empty cell. (If the norm layer is selected, it will only find empty cells in that column.) | ||
*In the norm column, select the empty cell and the cell above it; merge the two cells. | *In the norm column, select the empty cell and the cell above it; merge the two cells. | ||
Line 23: | Line 21: | ||
*select ignore:norm and delete the column. | *select ignore:norm and delete the column. | ||
- | You can do this without manual proofreading by: | + | You can do this without manual proofreading by pre-processing your data before running the tagger in the following way: |
- | Make a copy of norm in a new sheet | + | *Make a copy of norm in a new sheet |
*Unmerge all spans | *Unmerge all spans | ||
*Add a new column with a serial ID (1, | *Add a new column with a serial ID (1, |
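
The file-preparation step described in this page (a text file with Unix returns in UTF-8) and the serial ID column can be sketched in Python. This is only an illustration under stated assumptions: the function names are hypothetical and are not part of tree-tagger, the input is assumed to be the norm column pasted as one cell per line, and numbering every row from 1 is inferred from the truncated "serial ID (1," instruction. How the IDs are used afterwards is not covered here.

```python
from pathlib import Path


def normalize_newlines(text: str) -> str:
    """Convert Windows (CRLF) and old Mac (CR) line endings to Unix (LF)."""
    return text.replace("\r\n", "\n").replace("\r", "\n")


def split_rows(column_text: str) -> list[str]:
    """Split a pasted column into rows, one cell per line.

    Empty cells in the middle of the column are kept as empty strings;
    only trailing newlines at the very end are dropped.
    """
    text = normalize_newlines(column_text).rstrip("\n")
    return text.split("\n") if text else []


def write_tagger_input(rows: list[str], path: Path) -> None:
    """Write one row per line as UTF-8 with Unix line endings."""
    # newline="\n" stops Python from translating "\n" to the platform default.
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        handle.write("\n".join(rows) + "\n")


def with_serial_ids(rows: list[str]) -> list[tuple[int, str]]:
    """Pair each row with a serial ID starting at 1 (assumed numbering)."""
    return list(enumerate(rows, start=1))
```

For example, `write_tagger_input(split_rows(pasted_text), Path("norm.txt"))` would produce a file suitable for saving next to the tree-tagger script, where `pasted_text` and `norm.txt` are placeholder names.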
part_of_speech_tagging_using_tree-tagger.txt · Last modified: 2015/09/09 06:30 by ctschroeder