How Much Data Is Enough Data? A New Motion Capture Corpus for Probabilistic Sign Language Generation
Klezovich, Anna | Mesch, Johanna | Henter, Gustav Eje | Beskow, Jonas
- Volume: Proceedings of the 15th International Conference on Language Resources and Evaluation (LREC 2026)
- Venue: Palma, Mallorca, Spain
- Date: 11 to 16 May 2026
- Pages: 9549–9558
- Publisher: European Language Resources Association (ELRA)
- Licence: CC BY-NC 4.0
- DOI: 10.63317/5pmyrs7f9o33
- ISBN: 978-2-493814-49-4
Abstract
We present STS Mocap v1, a new 4.1-hour, high-quality motion capture dataset for Swedish Sign Language. The dataset consists of multimodal data: body motion tracked with markers, fingers tracked with Manus Quantum Metagloves, the face tracked with the iPhone LiveLink app in MetaHuman Animator mode, and corresponding sentence-level text translations into spoken Swedish. Using this dataset, we show that four hours of motion capture data is enough for generative modeling of sign language conditioned on 2D pose. In comparison, training the same flow-matching model on only 30 minutes of this data, a common size for sign language motion capture datasets, significantly degrades the quality of the synthesized data.
Cite as
Citation in ACL Citation Format
Anna Klezovich, Johanna Mesch, Gustav Eje Henter, Jonas Beskow. 2026. How Much Data Is Enough Data? A New Motion Capture Corpus for Probabilistic Sign Language Generation. In Proceedings of the 15th International Conference on Language Resources and Evaluation (LREC 2026), pages 9549–9558, Palma, Mallorca, Spain. European Language Resources Association (ELRA).
BibTeX Export
@inproceedings{klezovich-etal-2026-enough:lrec,
author = {Klezovich, Anna and Mesch, Johanna and Henter, Gustav Eje and Beskow, Jonas},
title = {How Much Data Is Enough Data? A New Motion Capture Corpus for Probabilistic Sign Language Generation},
pages = {9549--9558},
editor = {Piperidis, Stelios and Bel, N{\'u}ria and van den Heuvel, Henk and Ide, Nancy and Krek, Simon and Toral, Antonio},
booktitle = {15th International Conference on Language Resources and Evaluation ({LREC} 2026)},
publisher = {{European Language Resources Association (ELRA)}},
address = {Palma, Mallorca, Spain},
day = {11--16},
month = may,
year = {2026},
isbn = {978-2-493814-49-4},
language = {english},
url = {https://lrec.elra.info/lrec2026-main-750},
doi = {10.63317/5pmyrs7f9o33}
}