The Sign Language Dataset Compendium


Corpus

CREAGEST

The CREAGEST corpus is a corpus of adult and child French Sign Language and of natural gestures. It consists of three sub-corpora: a child acquisition dataset, a dataset of dialogues between deaf adults, and a dataset of natural gestures. For the acquisition dataset, 65 deaf children and 17 deaf adults were recorded by four deaf investigators. For the dialogue dataset, 51 interviews were conducted by four deaf investigators. For the gestural dataset, pairs of individuals were recorded: five hearing-hearing, five Deaf-Deaf, and Deaf-hearing pairs. In total, more than 500 hours of recordings of over 250 signers were made. The Creagest project was based at the Centre national de la recherche scientifique (CNRS) at the Université Paris 8, ran from 2007 to 2012, and was led by Christian Cuxac.

To collect French Sign Language production from the children, four tasks were used, covering free conversation as well as controlled elicitation. For the dialogues between deaf adults, semi-directive interviews were conducted, followed by a metalinguistic discussion of the lexical units collected. For the dataset of natural gestures, the pairs were given two explanation tasks.

Children were recorded with two cameras, and adult interviews with three. No information was found on the recording conditions of the gestural dataset.

Language: French Sign Language
Size: 500 hours recorded, 300 hours digitized
Participants: More than 250 participants
    Deaf and hearing
    Adults: 18–60 years old
    Children: 3–15 years old
    From 4 regions
Metadata Format: OLAC and IMDI
Translation: not available
Annotation: not available
    ~1 hour annotated
Data Format: ELAN
Licence: CC BY-NC-ND 3.0
Access: Access to subset of videos via Ortolang requires registration
Webpages: Dataset dialogue: https://www.ortolang.fr/market/corpora/ortolang-000926
    Dataset acquisition: https://www.ortolang.fr/market/corpora/ortolang-000916
Institution: Centre national de la recherche scientifique (CNRS)

Cite as

Balvet, A., Courtin, C., Boutet, D., Cuxac, C., Fusellier-Souza, I., Garcia, B., L’Huillier, M.-T. & Sallandre, M.-A. (2010). The Creagest Project: a Digitized and Annotated Corpus for French Sign Language (LSF) and Natural Gestural Languages. Proceedings of the International Language Resources and Evaluation Conference (LREC'2010), Malta, May 19-21, 2010, 469-475.

Garcia, B., L'Huillier, M.-T. & Sallandre, M.-A. (2013). CREAGEST : enjeux linguistiques, patrimoniaux et socio-éducatifs d’un grand corpus de langue des signes française. La nouvelle revue de l’adaptation et de la scolarisation, n° 64, INS HEA, 81-91.

Brigitte Garcia, Marie-Thérèse L'Huillier (2022). CREAGEST - Dialogue entre adultes sourds [Corpus]. ORTOLANG (Open Resources and TOols for LANGuage) - www.ortolang.fr, v1, https://hdl.handle.net/11403/ortolang-000926/v1.

Marie-Thérèse L'Huillier, Marie-Anne Sallandre (2016). CREAGEST - Acquisition [Corpus]. ORTOLANG (Open Resources and TOols for LANGuage) - www.ortolang.fr, v1, https://hdl.handle.net/11403/ortolang-000916/v1.

This entry was last modified on 27 January 2023.