The Sign Language Dataset Compendium


Corpus

Catalan Sign Language Corpus

The Catalan Sign Language Corpus (LSC Corpus) is an annotated corpus of Catalan Sign Language (LSC). Recordings were made in twelve deaf clubs around Catalonia. The participants were recorded in pairs, sitting opposite each other in front of a blue background. The signing was captured from a frontal perspective using two HD cameras. A Deaf signer led the participants through the recordings.

The corpus project started in late 2012 with a two-year pilot phase (Barberà et al., 2015) which then led to a long-term project that is still ongoing. The first public version of the corpus was released in 2025. The annotations of the LSC Corpus are associated with a lexical database (Quer Villanueva, 2017). A diverse set of data collection tasks was used to collect different discourse genres. Some of the tasks were adopted from the DGS Corpus, Corpus NGT, British Sign Language Corpus and Auslan Corpus. The corpus was created at the Institute for Catalan Studies, led by Josep Quer Villanueva and Gemma Barberà Altimira.

Language Catalan Sign Language
Size 560 hours recorded, 150,870 tokens annotated.
Participants 56 participants
3 age groups: 18–30, 31–50, 51–80 years
From 12 deaf clubs, based in 10 different cities in Catalonia (Barcelona, Terrassa, Manresa, Vic, Lleida, Blanes, Mataró, Badalona, Cambrils, Palafrugell)
Metadata Format information not available
Translation Catalan, 2.5 hours
Annotation Glossed transcription available for most recordings.
Data Format ELAN
Licence CC BY 4.0
Access Public access via browsable homepage
Direct download available for videos in public access and their ELAN annotation files, as well as a gloss list.
Restricted access for researchers to further data requires submission of a data transfer document.
Webpage Dataset: https://corpuslsc.iec.cat/
Institution Institute for Catalan Studies
Publications Barberà et al. (2015)
Quer Villanueva (2017)
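
Since the annotations are distributed as ELAN files, which are XML documents (.eaf), the glosses can be read with standard XML tooling. The following is a minimal sketch using only the Python standard library. The tier name "Gloss", the sample glosses, and the inline sample document are hypothetical and chosen for illustration only; the actual tier names used in the LSC Corpus should be checked in the downloaded files.

```python
import xml.etree.ElementTree as ET


def read_tier(eaf_xml, tier_id):
    """Return (start_ms, end_ms, value) tuples for one time-aligned ELAN tier."""
    root = ET.fromstring(eaf_xml)
    # Map time slot IDs to millisecond values; unaligned slots carry no TIME_VALUE.
    slots = {
        ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
        for ts in root.iter("TIME_SLOT")
        if ts.get("TIME_VALUE") is not None
    }
    rows = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_id:
            continue
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            value = (ann.findtext("ANNOTATION_VALUE") or "").strip()
            rows.append((start, end, value))
    return rows


# Hypothetical miniature EAF document; real corpus files contain more tiers and metadata.
SAMPLE = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="480"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="950"/>
  </TIME_ORDER>
  <TIER TIER_ID="Gloss" LINGUISTIC_TYPE_REF="gloss">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>FROG</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2" TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>JUMP</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""

for start, end, gloss in read_tier(SAMPLE, "Gloss"):
    print(start, end, gloss)
```

For a downloaded annotation file, the same function can be applied to the file's contents read as text. Tiers that depend on a parent tier store their entries as REF_ANNOTATION elements without time slots of their own, which this sketch does not handle.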

Cite as

Institut d’Estudis Catalans. 2025. Corpus de referència de la llengua de signes catalana (LSC) (CORPUS LSC) <https://corpuslsc.iec.cat/>

Common tasks used in this corpus

Task Deaf life experiences
# recordings – open access 28
# recordings – restricted access 0
Data available https://corpuslsc.iec.cat/en/explanation-of-an-anecdote-related-to-deafness/
Task Debate
# recordings – open access 30
# recordings – restricted access 0
Data available https://corpuslsc.iec.cat/en/debate-the-future-of-associations-of-the-deaf/
Task Frog Story
# recordings – open access 56
# recordings – restricted access 0
Data available https://corpuslsc.iec.cat/en/narration-the-frog-story/
Task Sign Name
# recordings – open access 28
# recordings – restricted access 0
Data available https://corpuslsc.iec.cat/en/presentation-and-name-sign/
Task Sylvester and Tweety
# recordings – open access 56
# recordings – restricted access 0
Data available https://corpuslsc.iec.cat/en/narration-silvester-and-tweety/

References

Primary references

This entry was last modified on 11 April 2025.