gitdox_workflow


GitDox is an online XML and spreadsheet editor which uses GitHub for data storage. Coptic Scriptorium currently uses GitDox to transcribe texts and edit them in a spreadsheet.

GitDox is located at https://corpling.uis.georgetown.edu/gitdox/

GitDox and GitHub

GitDox uses GitHub for file storage and versioning. In order to use GitDox, you must first have a GitHub account. You can set up an account at https://github.com/. Contact us at contact@copticscriptorium.org to get access to our GitHub development repositories.

As described below, users commit changes in GitDox directly to GitHub, but the link between GitDox and GitHub is a one-way street: changes from GitDox get committed to GitHub, but no changes made to files on GitHub are reflected in GitDox. Please only edit files created in GitDox on GitDox!

Creating an account

Contact us at contact@copticscriptorium.org to get an account. We suggest you use your GitHub username and password for GitDox as well.

Your GitDox account is linked to your GitHub account. If you change your password on GitHub, please be sure to update your password on GitDox by logging in and choosing admin.

When you log in to GitDox, you see a list of current documents. Use the dropdown menu above the list to display only documents from a certain corpus. You can also use the arrows to the right of each column name to sort by that column.

Documents are assigned to users (as noted in the fifth column). Please only edit documents assigned to you. If you believe a document should be assigned to you but isn't, contact the person it is currently assigned to and confirm the reassignment before editing it.

The mode column lets you know whether the document is currently in transcription (</>) or spreadsheet (table icon) mode. As of now, we are only using GitDox for transcription.

Creating a new document

Scroll to the bottom of the current document list and select the “new” button.

Enter the name of your file and corpus as well as the repository on GitHub where you want your file to be saved. Please use the main repository for your corpus and not a fork; once we migrate to GitHub, we will no longer use forks. Assign the document to yourself.

Saving and committing

The transcription mode of GitDox has two options for saving work: save and commit. The save button saves changes within GitDox but does not commit them to GitHub; the commit button commits them to GitHub. Because GitDox depends on an internet connection, it's a good idea to click the save button frequently while you work.

Be sure to commit your changes to GitHub at the end of each day: enter a commit message in the text box next to the commit button, then click the button. A red message at the bottom of the page should confirm that your commit was successful, and you can also check the repository on GitHub to see that your commit is there.
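
The check on GitHub can also be scripted. The sketch below is one possible approach and is not part of GitDox. It assumes the repository can be read through GitHub's REST API commits endpoint; private development repositories would also need an access token, which is not shown. The function names and the owner and repository values are hypothetical placeholders.

```python
import json
import urllib.request


def commits_url(owner, repo, per_page=5):
    """Build the GitHub REST API URL that lists a repository's most recent commits."""
    return f"https://api.github.com/repos/{owner}/{repo}/commits?per_page={per_page}"


def fetch_recent_commits(owner, repo, per_page=5):
    """Download the most recent commits as a list of dictionaries (needs network access)."""
    request = urllib.request.Request(
        commits_url(owner, repo, per_page),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def find_commit(commits, message):
    """Return the SHA of the first commit whose message starts with `message`, else None."""
    for entry in commits:
        if entry["commit"]["message"].startswith(message):
            return entry["sha"]
    return None
```

To use it, pass the result of `fetch_recent_commits("your-org", "your-corpus-repo")` to `find_commit` together with the commit message you typed in GitDox. A return value of `None` means the commit was not among the most recent ones listed.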

Transcribing

Follow our diplomatic transcription guidelines. Enter text in a Unicode Coptic character set (SCRIPTORIUM uses the Antinoou font and keyboard). Be sure there is a space between Coptic bound groups. You do not need to segment all the morphemes. We follow Layton's conventions for segmenting bound groups: for example, ⲁϥⲟⲩⲱϣⲃ ⲛⲁϥ, not ⲁ ϥ ⲟⲩⲱϣⲃ ⲛⲁ ϥ, nor ⲁϥⲟⲩⲱϣⲃⲛⲁϥ.
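
As an informal illustration of the spacing rule, the following sketch splits a line into bound groups on whitespace and lists any characters that are not Unicode Coptic letters. It is a hypothetical helper, not a GitDox feature, and it assumes plain text: XML tags and punctuation would be reported as unexpected characters.

```python
import unicodedata


def bound_groups(text):
    """Split a transcribed line into bound groups on whitespace."""
    return text.split()


def non_coptic_characters(text):
    """Return characters that are not whitespace, combining marks, or Coptic letters."""
    unexpected = []
    for ch in text:
        # Whitespace separates bound groups; combining marks (e.g. strokes) are allowed.
        if ch.isspace() or unicodedata.combining(ch):
            continue
        # Unicode names of Coptic letters in both relevant blocks start with "COPTIC".
        if not unicodedata.name(ch, "").startswith("COPTIC"):
            unexpected.append(ch)
    return unexpected


line = "ⲁϥⲟⲩⲱϣⲃ ⲛⲁϥ"
print(bound_groups(line))           # two bound groups
print(non_coptic_characters(line))  # empty list when everything is Unicode Coptic
```

A line typed with a legacy, non-Unicode font would typically show up here as a list of Latin characters.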

GitDox features syntax highlighting and suggestions. When you open an angle bracket, GitDox suggests tags that are currently available. GitDox will also suggest attributes for tags. Improperly closed tags and other errors are highlighted in red.

Adding metadata

Use the button at the bottom of the page to add metadata. Select a metadata field from the drop-down menu and enter its value. After you click submit, the metadata will appear in a table at the bottom of the page. You can't edit metadata you've already submitted, but you can delete the entry from the list and re-enter the correct field and value.

Tokenizing

Click the NLP button under the transcription box to automatically tokenize your transcribed text. Correct the tokenization in GitDox.

Processing using the NLP

We expect an automated link between the XML editor, the NLP, and the online spreadsheet to be available in summer 2017. However, as of May 2017, the following steps are necessary to move transcribed texts from the XML editor through the NLP and into the online spreadsheet.

  • After checking tokenization and committing the final text to GitHub, copy the text from GitDox and paste it into the online NLP (https://corpling.uis.georgetown.edu/coptic-nlp/). Make sure you select “from pipes in input” under tokenize.
  • Process the text, copy the resulting SGML, and paste it into a new file in a text editor such as TextWrangler or Notepad. Save the file on your computer.
  • In your file on GitDox, use the mode drop-down menu to change to the spreadsheet mode. A blank spreadsheet should appear.
  • Scroll to the bottom of the page and use the upload function to upload the file you saved on your computer. The spreadsheet should be populated with the information from your file.
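
The “from pipes in input” option in the first step can be pictured with a short sketch. It assumes, as an illustration only, that a pipe character (|) separates tokens inside a bound group and that spaces separate bound groups; the function name and the sample line are hypothetical, and the actual NLP pipeline may treat its input differently.

```python
def parse_piped(text):
    """Split text into bound groups, then split each group into tokens at pipes."""
    groups = []
    for group in text.split():
        tokens = group.split("|")
        # A doubled, leading, or trailing pipe produces an empty token.
        if "" in tokens:
            raise ValueError(f"empty token in bound group {group!r}")
        groups.append(tokens)
    return groups


# Hypothetical piped input: two bound groups, five tokens in total.
for tokens in parse_piped("ⲁ|ϥ|ⲟⲩⲱϣⲃ ⲛⲁ|ϥ"):
    print(tokens)
```

Running a check like this before pasting text into the NLP can catch stray or doubled pipes left over from correcting the tokenization.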

Editing annotations in the online spreadsheet

Follow the steps in the Basic Annotation Workflow starting at “Rename existing layers….” The online spreadsheet automatically saves all work.

Tips for working with the online spreadsheet

In progress

gitdox_workflow.1494960619.txt.gz · Last modified: 2017/05/16 12:50 by eplatte