Challenges in Development of the American Sign Language Lexicon Video Dataset (ASLLVD) Corpus

Neidle, Carol | Thangali, Ashwin | Sclaroff, Stan

Volume:: Proceedings of the LREC2012 5th Workshop on the Representation and Processing of Sign Languages: Interactions between Corpus and Lexicon
Venue:: Istanbul, Turkey
Date:: 27 May 2012
Pages:: 143–150
Publisher:: European Language Resources Association (ELRA)
Licence:: CC BY-NC 4.0
sign-lang ID:: 12011

Content Categories

Languages:: American Sign Language
Lexical Databases:: ASLLVD

Abstract

The American Sign Language Lexicon Video Dataset (ASLLVD) consists of videos of >3,300 ASL signs in citation form, each produced by 1-6 native ASL signers, for a total of almost 9,800 tokens. This dataset, including multiple synchronized videos showing the signing from different angles, will be shared publicly once the linguistic annotations and verifications are complete. Linguistic annotations include gloss labels, sign start and end time codes, start and end handshape labels for both hands, and morphological and articulatory classifications of sign type. For compound signs, the dataset includes annotations for each morpheme. To facilitate computer vision-based sign language recognition, the dataset also includes numeric ID labels for sign variants, video sequences in uncompressed raw format, camera calibration sequences, and software for skin region extraction. We discuss here some of the challenges involved in the linguistic annotations and categorizations. We also report an example computer vision application that leverages the ASLLVD: the formulation employs a HandShapes Bayesian Network (HSBN), which models the transition probabilities between start and end handshapes in monomorphemic lexical signs. Further details and statistics for the ASLLVD, as well as information about annotation conventions, are available from http://www.bu.edu/asllrp/lexicon.
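
The idea of transition probabilities between start and end handshapes can be illustrated with a minimal sketch. This is not the authors' HSBN, only a plain conditional probability table estimated from counts with add-alpha smoothing. The function name `transition_probs`, the `alpha` parameter, and the annotation pairs are all hypothetical and chosen purely for illustration.

```python
from collections import Counter, defaultdict


def transition_probs(pairs, alpha=1.0):
    """Estimate P(end handshape | start handshape) from (start, end) pairs.

    Uses add-alpha smoothing so that unseen transitions get a small,
    non-zero probability. This is a simplified stand-in, not the HSBN.
    """
    # Every handshape label seen in either position.
    shapes = sorted({shape for pair in pairs for shape in pair})

    # counts[start][end] = number of signs with that start/end combination.
    counts = defaultdict(Counter)
    for start, end in pairs:
        counts[start][end] += 1

    probs = {}
    for start in shapes:
        total = sum(counts[start].values()) + alpha * len(shapes)
        probs[start] = {
            end: (counts[start][end] + alpha) / total for end in shapes
        }
    return probs


# Hypothetical (start, end) handshape labels, not taken from the ASLLVD.
pairs = [("B", "B"), ("B", "A"), ("5", "S"), ("5", "S"), ("A", "A")]
probs = transition_probs(pairs)
print(probs["5"]["S"])
```

Each row of the resulting table sums to 1, so it can be read as a distribution over end handshapes given a start handshape. A recognition system could, under these assumptions, use such a table to rerank candidate handshape pairs.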

Cite as

Citation in ACL Citation Format

Carol Neidle, Ashwin Thangali, Stan Sclaroff. 2012. Challenges in Development of the American Sign Language Lexicon Video Dataset (ASLLVD) Corpus. In Proceedings of the LREC2012 5th Workshop on the Representation and Processing of Sign Languages: Interactions between Corpus and Lexicon, pages 143–150, Istanbul, Turkey. European Language Resources Association (ELRA).

BibTeX Export

@inproceedings{neidle:12011:sign-lang:lrec,
  author    = {Neidle, Carol and Thangali, Ashwin and Sclaroff, Stan},
  title     = {Challenges in Development of the {American} {Sign} {Language} Lexicon Video Dataset ({ASLLVD}) Corpus},
  pages     = {143--150},
  editor    = {Crasborn, Onno and Efthimiou, Eleni and Fotinea, Stavroula-Evita and Hanke, Thomas and Kristoffersen, Jette and Mesch, Johanna},
  booktitle = {Proceedings of the {LREC2012} 5th Workshop on the Representation and Processing of Sign Languages: Interactions between Corpus and Lexicon},
  maintitle = {8th International Conference on Language Resources and Evaluation ({LREC} 2012)},
  publisher = {{European Language Resources Association (ELRA)}},
  address   = {Istanbul, Turkey},
  day       = {27},
  month     = may,
  year      = {2012},
  language  = {english},
  url       = {https://www.sign-lang.uni-hamburg.de/lrec/pub/12011.html}
}