The Sign Language Dataset Compendium


Corpus

DGS Corpus

The DGS Corpus is a collection of German Sign Language (DGS) data from 330 signers from Germany. The 19-year long-term project started in 2009 and is based at the Institute of German Sign Language and Communication of the Deaf at the University of Hamburg. It is led by Thomas Hanke and Annika Herrmann. The DGS Corpus is used to build the DGS–German dictionary DW-DGS.

Between 2010 and 2012, the signers were recorded in pairs in a mobile studio that travelled to thirteen locations in Germany. The signers sat opposite each other in front of a blue background. Seven cameras were used for the recordings in total: five HD cameras and two Bumblebees. The Bumblebees were later replaced by three HD stereo cameras. The cameras were set up at three different angles: one recording a total view including the moderator, one filming each signer from the front, and one filming from above. The original resolution is 1080i50 for the videos from 2010 and 720p50 for the videos from 2011 onwards. Public data is provided in 360p50. A Deaf moderator guided the signers through the tasks.
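For readers unfamiliar with labels such as 1080i50 or 720p50, the following is a minimal sketch of how they can be decoded. It assumes the common convention that the leading number is the vertical resolution in lines, the letter marks interlaced (`i`) or progressive (`p`) scanning, and the trailing number is the rate in fields or frames per second. The names `VideoMode`, `parse_video_mode` and `original_mode` are hypothetical and not part of any DGS Corpus tooling.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoMode:
    lines: int        # vertical resolution in lines
    interlaced: bool  # True for "i" (interlaced), False for "p" (progressive)
    rate: int         # number following the scan letter (fields or frames per second)


def parse_video_mode(label: str) -> VideoMode:
    """Split a label such as '1080i50' into its three components."""
    match = re.fullmatch(r"(\d+)([ip])(\d+)", label)
    if match is None:
        raise ValueError(f"unrecognised video mode: {label!r}")
    lines, scan, rate = match.groups()
    return VideoMode(int(lines), scan == "i", int(rate))


def original_mode(year: int) -> VideoMode:
    """Original recording mode by year, as stated in the text:
    1080i50 for 2010, 720p50 from 2011 onwards."""
    if year < 2010:
        raise ValueError("recordings started in 2010")
    return parse_video_mode("1080i50" if year == 2010 else "720p50")


# Mode of the publicly provided videos, as stated in the text.
PUBLIC_MODE = parse_video_mode("360p50")
```

Under these assumptions, `original_mode(2010)` describes an interlaced 1080-line source, while the public 360p50 files are progressive and considerably smaller.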

The DGS Corpus is available in different formats:

MY DGS is a community portal that offers easy access to the data, tailored to users interested in the content of the conversations. Videos can be watched in an online viewer with subtitles.

MY DGS – annotated is a research portal which offers the annotated corpus data for linguistic research.

MY DGS – ANNIS is another research portal making the DGS Corpus available via the corpus tool ANNIS, a web browser-based search and visualization architecture for complex multilayer linguistic corpora.

Language: German Sign Language
Size: 560 hours recorded, 657,000 tokens annotated
Participants: 330 participants
  4 age groups: 18–30, 31–45, 46–60, 61 years and older
  165 female, 165 male
  From all over Germany, divided into 13 distinct regions
Metadata Format: CMDI
Translation: German and English, 375.8 hours (German), 113 hours (English)
Annotation: See Konrad et al. (2022)
  90.9 hours annotated
Data Format: iLex
Licence: DGS Corpus License
Access: Public access via browsable homepage
  Open access to 50 hours of video, annotation and translation in iLex, ELAN and SRT format
  Restricted access for researchers to further data requires individual license agreement
Webpages: Project page: https://dgs-korpus.de
  Dataset: https://ling.meine-dgs.de
  Public access: https://meine-dgs.de
  ANNIS: https://annis.meine-dgs.de
Institution: Institute of German Sign Language and Communication of the Deaf, University of Hamburg
Publications: Hanke et al. (2020)
  https://dgs-korpus.de/publications.html
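
The access notes above list SRT among the open-access formats. As an illustration of how such a file could be read, here is a minimal sketch of an SRT parser using only the Python standard library. It assumes the usual SRT layout (a cue number, a `start --> end` timestamp line, then one or more text lines, with cues separated by blank lines). The sample text is invented for the example and is not taken from the corpus, and the names `Cue`, `to_seconds` and `parse_srt` are hypothetical.

```python
import re
from dataclasses import dataclass

# SRT timestamps look like 00:01:02,345 (hours:minutes:seconds,milliseconds).
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


@dataclass
class Cue:
    index: int
    start: float  # seconds from the beginning of the video
    end: float
    text: str


def to_seconds(stamp: str) -> float:
    """Convert an SRT timestamp to seconds."""
    match = TIMESTAMP.fullmatch(stamp.strip())
    if match is None:
        raise ValueError(f"bad timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(g) for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(content: str) -> list[Cue]:
    """Parse SRT text into a list of cues."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.splitlines()
        if len(lines) < 2:
            continue  # skip malformed fragments
        start, end = lines[1].split("-->")
        cues.append(Cue(int(lines[0]), to_seconds(start), to_seconds(end),
                        "\n".join(lines[2:])))
    return cues


# Invented sample, NOT corpus data.
SAMPLE = """1
00:00:01,000 --> 00:00:03,500
Hello, how are you?

2
00:00:04,000 --> 00:00:06,250
Fine, thanks.
"""

cues = parse_srt(SAMPLE)
```

Each `Cue` then carries its time span in seconds, which makes it straightforward to align a translation with a position in the corresponding video.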

Cite as

Konrad, R., Hanke, T., Langer, G., Blanck, D., Bleicken, J., Hofmann, I., Jeziorski, O., König, L., König, S., Nishio, R., Regen, A., Salden, U., Wagner, S., Worseck, S., Böse, O., Jahn, E., Schulder, M. 2020. MEINE DGS – annotiert. Öffentliches Korpus der Deutschen Gebärdensprache, 3. Release / MY DGS – annotated. Public Corpus of German Sign Language, 3rd release [Dataset]. Universität Hamburg. https://doi.org/10.25592/dgs.corpus-3.0

Thomas Hanke, Marc Schulder, Reiner Konrad, Elena Jahn (2020). "Extending the Public DGS Corpus in Size and Depth". In: Proceedings of the LREC2020 9th Workshop on the Representation and Processing of Sign Languages: Sign Language Resources in the Service of the Language Community, Technological Challenges and Application Perspectives (Marseille, France). Ed. by Eleni Efthimiou, Stavroula-Evita Fotinea, Thomas Hanke, Julie A. Hochgesang, Jette Kristoffersen, Johanna Mesch. Paris, France: European Language Resources Association (ELRA), pp. 75-82. ISBN: 979-10-95546-54-2. ACL: 2020.signlang-1.12.

Common tasks used in this corpus

Task | # recordings – open access | # recordings – restricted access | Data available
Calendar | 1 | 167 | https://meine-dgs.de/formats/format16_en.html
Deaf life experiences | 63 | 259 | https://meine-dgs.de/formats/format3_en.html
Debate | 28 | 137 | https://meine-dgs.de/formats/format6_de.html
Describe process | 13 | 153 | https://meine-dgs.de/formats/format4_en.html
Diachronic changes | 1 | 157 | https://meine-dgs.de/formats/format19_en.html
Fire Alarm Story | 1 | 67 | https://meine-dgs.de/formats/format7_en.html
Free conversation | 34 | 131 | https://meine-dgs.de/formats/format8_en.html
Frog Story | 1 (in 6 parts) | 81 | https://meine-dgs.de/formats/format9_en.html
Jokes | 88 | 49 | https://meine-dgs.de/formats/format21_en.html
Lexical elicitation | 0 | 168 | none
Pear Story | 1 | 82 | https://meine-dgs.de/formats/format5_en.html
Route description | 1 | 65 | https://meine-dgs.de/formats/format15_en.html
Sign Name | 0 | 168 | none
Signs Movie | 1 | 141 | https://meine-dgs.de/formats/format20_en.html
Subject areas | 26 (25 subject areas) | 349 | https://meine-dgs.de/formats/format13_en.html
Sylvester and Tweety | 3 (each in 7 parts) | 81 | https://meine-dgs.de/formats/format18_en.html
Warning and prohibition signs | 16 | 152 | https://meine-dgs.de/formats/format14_en.html
What did you do when it happened | 52 | 279 | https://meine-dgs.de/formats/format1_en.html
Your region | 13 | 67 | https://meine-dgs.de/formats/format10_en.html
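
The per-task counts above mix plain numbers with annotated ones such as "1 (in 6 parts)". The sketch below shows one way to normalise such cells before tallying them. It uses only a small subset of the tasks, copied as written above, and the helper names `leading_count` and `summarise` are hypothetical. It treats each task entry as one count and does not try to interpret the parenthetical notes.

```python
import re

# (task, open-access cell, restricted-access cell) for a SUBSET of the tasks,
# copied as written in the listing; extend with the remaining tasks as needed.
ROWS = [
    ("Calendar", "1", "167"),
    ("Frog Story", "1 (in 6 parts)", "81"),
    ("Subject areas", "26 (25 subject areas)", "349"),
    ("Sylvester and Tweety", "3 (each in 7 parts)", "81"),
    ("Lexical elicitation", "0", "168"),
]


def leading_count(cell: str) -> int:
    """Return the integer at the start of a cell, ignoring any trailing note."""
    match = re.match(r"\s*(\d+)", cell)
    if match is None:
        raise ValueError(f"no count found in {cell!r}")
    return int(match.group(1))


def summarise(rows):
    """Total the open and restricted counts and list tasks without open data."""
    open_total = sum(leading_count(open_cell) for _, open_cell, _ in rows)
    restricted_total = sum(leading_count(restricted) for _, _, restricted in rows)
    no_open_data = [task for task, open_cell, _ in rows
                    if leading_count(open_cell) == 0]
    return open_total, restricted_total, no_open_data


open_total, restricted_total, no_open_data = summarise(ROWS)
```

Keeping the cells as strings and extracting the leading number at tally time preserves the original notes, so the data can still be checked against the listing.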

This entry was last modified on 27 March 2025.