Annotation of LSF subtitled videos without a pre-existing dictionary
Lascar, Julie | Gouiffès, Michèle | Braffort, Annelies | Danet, Claire
- Volume: Proceedings of the LREC-COLING 2024 11th Workshop on the Representation and Processing of Sign Languages: Evaluation of Sign Language Resources
- Venue: Torino, Italy
- Date: 25 May 2024
- Pages: 100–108
- Publisher: ELRA Language Resources Association (ELRA) and the International Committee on Computational Linguistics (ICCL)
- License: CC BY-NC 4.0
- sign-lang ID: 24012
- ACL ID: 2024.signlang-1.22
- ISBN: 978-2-493814-30-2
Content Categories
- Languages: French Sign Language
- Corpora: Mediapi-RGB
Abstract
This paper proposes a method for the automatic annotation of lexical units in LSF videos, using a subtitled corpus without annotation. The method, which is based on machine learning and involves linguists for added precision and reliability, comprises several stages. The first consists of building a bilingual lexicon (including potential variants of a given lexical unit) in a weakly supervised manner. The resulting lexicon is then refined and cleaned by LSF experts. These data are then used to train a supervised classifier for the automatic annotation of lexical units on the Mediapi-RGB corpus. Our PyTorch implementation is publicly available.
Cite as
Citation in ACL Citation Format
Julie Lascar, Michèle Gouiffès, Annelies Braffort, Claire Danet. 2024. Annotation of LSF subtitled videos without a pre-existing dictionary. In Proceedings of the LREC-COLING 2024 11th Workshop on the Representation and Processing of Sign Languages: Evaluation of Sign Language Resources, pages 100–108, Torino, Italy. ELRA Language Resources Association (ELRA) and the International Committee on Computational Linguistics (ICCL).
BibTeX Export
@inproceedings{lascar:24012:sign-lang:lrec,
  author    = {Lascar, Julie and Gouiff{\`e}s, Mich{\`e}le and Braffort, Annelies and Danet, Claire},
  title     = {Annotation of {LSF} subtitled videos without a pre-existing dictionary},
  pages     = {100--108},
  editor    = {Efthimiou, Eleni and Fotinea, Stavroula-Evita and Hanke, Thomas and Hochgesang, Julie A. and Mesch, Johanna and Schulder, Marc},
  booktitle = {Proceedings of the {LREC-COLING} 2024 11th Workshop on the Representation and Processing of Sign Languages: Evaluation of Sign Language Resources},
  maintitle = {2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation ({LREC-COLING} 2024)},
  publisher = {{ELRA Language Resources Association (ELRA) and the International Committee on Computational Linguistics (ICCL)}},
  address   = {Torino, Italy},
  day       = {25},
  month     = may,
  year      = {2024},
  isbn      = {978-2-493814-30-2},
  language  = {english},
  url       = {https://www.sign-lang.uni-hamburg.de/lrec/pub/24012.pdf}
}