This paper proposes a method for the automatic annotation of lexical units in LSF videos, using a subtitled corpus without annotation. This method, based on machine learning and involving linguists for added precision and reliability, comprises several stages. The first consists of building a bilingual lexicon (including potential variants of a given lexical unit) in a weakly supervised manner. The resulting lexicon is then refined and cleaned by LSF experts. This data is next used to train a supervised classifier for automatic annotation of lexical units on the Mediapi-RGB corpus. Our PyTorch implementation is publicly available.
@inproceedings{lascar:24012:sign-lang:lrec,
  author    = {Lascar, Julie and Gouiff{\`e}s, Mich{\`e}le and Braffort, Annelies and Danet, Claire},
  title     = {Annotation of {LSF} subtitled videos without a pre-existing dictionary},
  booktitle = {Proceedings of the {LREC-COLING} 2024 11th Workshop on the Representation and Processing of Sign Languages: Evaluation of Sign Language Resources},
  maintitle = {2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation ({LREC-COLING} 2024)},
  editor    = {Efthimiou, Eleni and Fotinea, Stavroula-Evita and Hanke, Thomas and Hochgesang, Julie A. and Mesch, Johanna and Schulder, Marc},
  pages     = {100--108},
  publisher = {{ELRA Language Resources Association (ELRA) and the International Committee on Computational Linguistics (ICCL)}},
  address   = {Torino, Italy},
  day       = {25},
  month     = may,
  year      = {2024},
  isbn      = {978-2-493814-30-2},
  language  = {english},
  url       = {https://www.sign-lang.uni-hamburg.de/lrec/pub/24012.pdf},
}