The Danish Sign Language Corpus and Dictionary project at the Centre for Sign Language, UCC, has a dual aim: to build a Danish Sign Language corpus, and to use this corpus to expand and improve the Danish Sign Language Dictionary. Our goal is a one-to-one correspondence between sign lemmas in the corpus and the dictionary, but due to limited resources, we cannot include an accurate phonological description of each sign form. To ensure consistent lemmatisation within the corpus as well as across the two resources, we thus rely exclusively on sign videos and Danish equivalents. In this paper, we describe how we use the lemmas of the Danish Sign Language Dictionary, and additional signs found in connection with the dictionary work, as the initial lexical database of the corpus tool. For new signs found in the corpus, the actual corpus tokens will serve as preliminary video representations. To facilitate sign search when lemmatising corpus tokens, we assign several Danish equivalents to each sign, including all equivalents in the dictionary data. Furthermore, we include synonyms found by linking these equivalents to the Danish wordnet (DanNet), although equivalents added in this way cannot be regarded as valid senses of the sign.
Keywords
Experiences in building sign language corpora
Linking corpora and lexicons and integrated presentation of corpus and dictionary contents
@inproceedings{troelsgard:18009:sign-lang:lrec,
author = {Troelsg{\aa}rd, Thomas and Kristoffersen, Jette},
title = {Improving Lemmatisation Consistency without a Phonological Description. The {Danish} {Sign} {Language} Corpus and Dictionary Project.},
pages = {195--198},
editor = {Bono, Mayumi and Efthimiou, Eleni and Fotinea, Stavroula-Evita and Hanke, Thomas and Hochgesang, Julie A. and Kristoffersen, Jette and Mesch, Johanna and Osugi, Yutaka},
booktitle = {Proceedings of the {LREC2018} 8th Workshop on the Representation and Processing of Sign Languages: Involving the Language Community},
maintitle = {11th International Conference on Language Resources and Evaluation ({LREC} 2018)},
publisher = {{European Language Resources Association (ELRA)}},
address = {Miyazaki, Japan},
day = {12},
month = may,
year = {2018},
isbn = {979-10-95546-01-6},
language = {english},
url = {https://www.sign-lang.uni-hamburg.de/lrec/pub/18009.pdf}
}