sign-lang@LREC Anthology

Improving Lemmatisation Consistency without a Phonological Description. The Danish Sign Language Corpus and Dictionary Project.

Troelsgård, Thomas | Kristoffersen, Jette


Volume:
Proceedings of the LREC2018 8th Workshop on the Representation and Processing of Sign Languages: Involving the Language Community
Venue:
Miyazaki, Japan
Date:
12 May 2018
Pages:
195–198
Publisher:
European Language Resources Association (ELRA)
License:
CC BY-NC 4.0
sign-lang ID:
18009
ISBN:
979-10-95546-01-6

Content Categories

Languages:
Danish Sign Language
Corpora:
DTS Corpus
Dictionaries:
DTS Dictionary

Abstract

The Danish Sign Language Corpus and Dictionary project at the Centre for Sign Language, UCC has a dual aim: to build a Danish Sign Language corpus, and to use this corpus to expand and improve the Danish Sign Language Dictionary. Our goal is a one-to-one correspondence between sign lemmas in the corpus and the dictionary, but due to limited resources, we cannot include an accurate phonological description of each sign form. To secure consistent lemmatisation within the corpus as well as across the two resources, we therefore rely exclusively on sign videos and Danish equivalents. In this paper, we describe how we use the lemmas of the Danish Sign Language Dictionary, together with additional signs found in connection with the dictionary work, as the initial lexical database of the corpus tool. For new signs found in the corpus, the actual corpus tokens will serve as preliminary video representations. To facilitate sign search when lemmatising corpus tokens, we assign several Danish equivalents to each sign, including all equivalents in the dictionary data. Furthermore, we include synonyms found by linking these equivalents to the Danish wordnet (DanNet), although equivalents added in this way cannot be regarded as valid senses of the sign.
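
The search strategy described in the abstract can be illustrated with a minimal sketch. It is not the project's actual corpus tool: the class `SignEntry`, the functions `add_synonyms`, `build_index` and `find_signs`, the video file name, and the small hand-written synonym map standing in for DanNet are all hypothetical. The sketch only shows the idea that each sign is identified by a video, carries several Danish equivalents, and gains wordnet synonyms that are flagged as search aids rather than valid senses.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class SignEntry:
    """One sign lemma, identified by a video instead of a phonological form."""
    sign_id: str
    video: str  # hypothetical file name of the sign video
    equivalents: Set[str] = field(default_factory=set)  # Danish equivalents from dictionary data
    search_only: Set[str] = field(default_factory=set)  # synonyms for search, not valid senses


def add_synonyms(entry: SignEntry, synonym_map: Dict[str, List[str]]) -> None:
    """Add synonyms of each equivalent, keeping them apart from real equivalents."""
    for word in entry.equivalents:
        for synonym in synonym_map.get(word, []):
            if synonym not in entry.equivalents:
                entry.search_only.add(synonym)


def build_index(entries: List[SignEntry]) -> Dict[str, List[Tuple[str, bool]]]:
    """Map each Danish word to (sign_id, is_valid_sense) pairs."""
    index: Dict[str, List[Tuple[str, bool]]] = {}
    for entry in entries:
        for word in entry.equivalents:
            index.setdefault(word, []).append((entry.sign_id, True))
        for word in entry.search_only:
            index.setdefault(word, []).append((entry.sign_id, False))
    return index


def find_signs(index: Dict[str, List[Tuple[str, bool]]], word: str) -> List[Tuple[str, bool]]:
    """Return candidate signs for a Danish word, valid senses first."""
    return sorted(index.get(word, []), key=lambda hit: (not hit[1], hit[0]))


# Hypothetical data: one sign and a tiny stand-in for wordnet synonym links.
house = SignEntry("HUS", "hus.mp4", equivalents={"hus", "bolig"})
synonym_map = {"hus": ["bygning"], "bolig": ["hjem", "hus"]}
add_synonyms(house, synonym_map)
index = build_index([house])
```

With this data, an annotator searching for "bygning" would still be led to the sign `HUS`, but the hit is marked as a search-only match, so it would not be mistaken for a sense of the sign.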

BibTeX Export

@inproceedings{troelsgard:18009:sign-lang:lrec,
  author    = {Troelsg{\aa}rd, Thomas and Kristoffersen, Jette},
  title     = {Improving Lemmatisation Consistency without a Phonological Description. The {Danish} {Sign} {Language} Corpus and Dictionary Project.},
  pages     = {195--198},
  editor    = {Bono, Mayumi and Efthimiou, Eleni and Fotinea, Stavroula-Evita and Hanke, Thomas and Hochgesang, Julie A. and Kristoffersen, Jette and Mesch, Johanna and Osugi, Yutaka},
  booktitle = {Proceedings of the {LREC2018} 8th Workshop on the Representation and Processing of Sign Languages: Involving the Language Community},
  maintitle = {11th International Conference on Language Resources and Evaluation ({LREC} 2018)},
  publisher = {{European Language Resources Association (ELRA)}},
  address   = {Miyazaki, Japan},
  day       = {12},
  month     = may,
  year      = {2018},
  isbn      = {979-10-95546-01-6},
  language  = {english},
  url       = {https://www.sign-lang.uni-hamburg.de/lrec/pub/18009.pdf}
}