The annotation tool ELAN was enhanced within the Corpus NGT project with a number of new and improved functions. Most of these functions are not specific to working with sign language video data and can readily be used for other annotation purposes as well. What unites them is their direct utility for working with large numbers of annotation files during the development and use of the Corpus NGT project. The following functions appeared in a series of releases between versions 2.6 and 3.4:
The ‘duplicate annotation’ function was created to facilitate the glossing of two-handed signs in cases where there are separate tiers for the left and the right hand: copying an annotation to another tier saves annotators quite some time, and prevents misspellings.
A ‘multiple file search’ was implemented: structured searches combining search criteria on different tiers can be carried out in a subset of files that can be created by the user.
The segmentation function was further developed so that annotations with a fixed, user-definable duration can be created with a single keystroke while the media files are playing. The keystroke can mark either the beginning or the end of an annotation.
A function has been added to flexibly generate annotation content based on a user-definable prefix and an index number.
A panel can be displayed that lists basic statistics for all tiers in an annotation document: the number of annotations and the minimum, maximum, average, median and total annotation duration per tier. This helps the user get a better grip on the content of an annotation document and can be helpful in data analysis.
The annotation density viewer can now also be set to only show the distribution of annotations of a single, selectable tier. The label of a tier in the timeline viewer can optionally show the current number of annotations on that tier.
The property ‘annotator’ has been added to the specification of tiers, allowing groups of researchers to record which tier has been filled in by whom.
A list of unique annotation values or a list of unique words can be exported from multiple annotation documents.
Any of the associated video files can be hidden and shown interactively, without having to remove the media file association altogether.
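The per-tier statistics mentioned in the list above can be illustrated with a minimal Python sketch. It is not ELAN's implementation: the function name `tier_statistics`, the tier names, and the representation of annotations as `(start_ms, end_ms)` pairs are all assumptions made for this example.

```python
from statistics import mean, median


def tier_statistics(tiers):
    """Compute duration statistics per tier.

    `tiers` maps a tier name to a list of (start_ms, end_ms) pairs.
    This data layout is hypothetical, chosen only for illustration.
    """
    stats = {}
    for name, annotations in tiers.items():
        durations = [end - start for start, end in annotations]
        if not durations:
            # An empty tier has a count but no duration statistics.
            stats[name] = {"count": 0}
            continue
        stats[name] = {
            "count": len(durations),
            "min": min(durations),
            "max": max(durations),
            "mean": mean(durations),
            "median": median(durations),
            "total": sum(durations),
        }
    return stats


# Hypothetical tiers: separate gloss tiers for the left and right hand.
example = {
    "GlossL": [(0, 400), (500, 1100), (1200, 1500)],
    "GlossR": [(0, 400)],
    "Translation": [],
}
stats = tier_statistics(example)
for tier_name, values in stats.items():
    print(tier_name, values)
```

Keeping the durations in milliseconds as integers avoids rounding issues; only the mean is in general a non-integer value.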
In addition, a large number of user interface improvements have been implemented, including the following:
Improved, more intuitive layout of the main menu bar
Additional keyboard shortcuts; the list of shortcuts can be printed
A recent files list has been added
Easy keyboard navigation through the opened documents/windows
A subtle change in the background of the timeline viewer, facilitating the perception of the distinction between the different tiers
With the use of a new preferences system in version 3, users can now set the colour of tier labels in the timeline viewer, allowing the visual grouping of related tiers in documents containing many tiers.
Although enhanced search functionalities and templates facilitate working with multiple ELAN documents, it is not yet possible to ‘manage’ a set of ELAN files systematically in any way. Perl scripts were developed in order to add tiers and linguistic types to a set of documents, to change annotation values in multiple documents, and to generate ELAN and preferences files on the basis of a set of media files and existing annotation and preferences files. Future collaboration between the ELAN developers at the Max Planck Institute for Psycholinguistics and the sign language researchers at Radboud University will be targeted at enhancing search facilities and facilitating teamwork between researchers using large language corpora containing ELAN documents.
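The kind of batch operation described above, adding a tier to a set of documents, could look roughly like the following Python sketch. The project's own scripts were written in Perl and are not reproduced here. The sketch assumes that each document is ELAN's XML-based annotation format with `TIER` elements directly under the root, identified by `TIER_ID` and `LINGUISTIC_TYPE_REF` attributes; the function `add_tier`, the sample document, and the tier names are hypothetical.

```python
import xml.etree.ElementTree as ET


def add_tier(eaf_xml, tier_id, linguistic_type):
    """Return the document text with an empty tier added.

    Documents that already contain a tier with this ID are returned unchanged.
    """
    root = ET.fromstring(eaf_xml)
    tiers = root.findall("TIER")
    if any(tier.get("TIER_ID") == tier_id for tier in tiers):
        return eaf_xml
    new_tier = ET.Element(
        "TIER",
        {"TIER_ID": tier_id, "LINGUISTIC_TYPE_REF": linguistic_type},
    )
    children = list(root)
    if tiers:
        # Keep all tiers together: insert after the last existing tier.
        position = children.index(tiers[-1]) + 1
    else:
        # No tiers yet: insert before the first linguistic type, if any.
        first_type = root.find("LINGUISTIC_TYPE")
        position = (
            children.index(first_type) if first_type is not None else len(children)
        )
    root.insert(position, new_tier)
    return ET.tostring(root, encoding="unicode")


# A heavily simplified, hypothetical document used only for this example.
SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER/>
  <TIER TIER_ID="GlossL" LINGUISTIC_TYPE_REF="gloss"/>
  <LINGUISTIC_TYPE LINGUISTIC_TYPE_ID="gloss" TIME_ALIGNABLE="true"/>
</ANNOTATION_DOCUMENT>"""

documents = {"session1.eaf": SAMPLE, "session2.eaf": SAMPLE}
updated = {
    name: add_tier(text, "GlossR", "gloss") for name, text in documents.items()
}
```

In practice the document texts would be read from and written back to files. Returning an existing document unchanged when the tier is already present makes it safe to run the operation repeatedly over the same set of files.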
@inproceedings{crasborn:08022:sign-lang:lrec,
author = {Crasborn, Onno and Sloetjes, Han},
title = {Enhanced {ELAN} functionality for sign language corpora},
pages = {39--43},
editor = {Crasborn, Onno and Efthimiou, Eleni and Hanke, Thomas and Thoutenhoofd, Ernst D. and Zwitserlood, Inge},
booktitle = {Proceedings of the {LREC2008} 3rd Workshop on the Representation and Processing of Sign Languages: Construction and Exploitation of Sign Language Corpora},
maintitle = {6th International Conference on Language Resources and Evaluation ({LREC} 2008)},
publisher = {{European Language Resources Association (ELRA)}},
address = {Marrakech, Morocco},
day = {1},
month = jun,
year = {2008},
language = {english},
url = {https://www.sign-lang.uni-hamburg.de/lrec/pub/08022.pdf}
}