DGS Corpus
The DGS Corpus is a collection of German Sign Language (DGS) data from 330 signers across Germany. The 19-year long-term project started in 2009 and is based at the Institute of German Sign Language and Communication of the Deaf at the University of Hamburg. It is led by Thomas Hanke and Annika Herrmann. The DGS Corpus is used to build the DGS–German dictionary DW-DGS.
Between 2010 and 2012, the signers were recorded in pairs in a mobile studio that travelled to thirteen locations in Germany. The signers sat opposite each other in front of a blue background. Seven cameras were used for the recordings: five HD cameras and two Bumblebees. The Bumblebees were later replaced by three HD stereo cameras. The cameras were set up at three different angles: one recording a total view including the moderator, one filming each signer from the front, and one from above. The original resolution is 1080i50 for the videos from 2010 and 720p50 for the videos from 2011 onwards. Public data is provided in 360p50. A Deaf moderator led the signers through the tasks.
The DGS Corpus is available in different formats:
- MY DGS is a community portal that offers easy access to the data, tailored to users interested in the content of the conversations. Videos can be watched in an online viewer with subtitles.
- MY DGS – annotated is a research portal that offers the annotated corpus data for linguistic research.
- MY DGS – ANNIS is another research portal, making the DGS Corpus available via the corpus tool ANNIS, a web browser-based search and visualization architecture for complex multilayer linguistic corpora.
Language | German Sign Language |
---|---|
Size | 560 hours recorded, 657,000 tokens annotated |
Participants | 330 participants; 4 age groups: 18–30, 31–45, 46–60, 61 years and older; 165 female, 165 male; from all over Germany, divided into 13 distinct regions |
Metadata Format | CMDI |
Translation | German and English, 375.8 hours (German), 113 hours (English) |
Annotation | See Konrad et al. (2022); 90.9 hours annotated |
Data Format | iLex |
Licence | DGS Corpus License |
Access | Public access via browsable homepage; open access to 50 hours of video, annotation and translation in iLex, ELAN and SRT format; restricted access for researchers to further data requires an individual license agreement |
Webpages | Project page: https://dgs-korpus.de ; Dataset: https://ling.meine-dgs.de ; Public access: https://meine-dgs.de ; ANNIS: https://annis.meine-dgs.de |
Institution | Institute of German Sign Language and Communication of the Deaf, University of Hamburg |
Publications | Hanke et al. (2020); https://dgs-korpus.de/publications.html |
Cite as
Konrad, R., Hanke, T., Langer, G., Blanck, D., Bleicken, J., Hofmann, I., Jeziorski, O., König, L., König, S., Nishio, R., Regen, A., Salden, U., Wagner, S., Worseck, S., Böse, O., Jahn, E., Schulder, M. 2020. MEINE DGS – annotiert. Öffentliches Korpus der Deutschen Gebärdensprache, 3. Release / MY DGS – annotated. Public Corpus of German Sign Language, 3rd release [Dataset]. Universität Hamburg. https://doi.org/10.25592/dgs.corpus-3.0
Thomas Hanke, Marc Schulder, Reiner Konrad, Elena Jahn (2020). "Extending the Public DGS Corpus in Size and Depth". In: Proceedings of the LREC2020 9th Workshop on the Representation and Processing of Sign Languages: Sign Language Resources in the Service of the Language Community, Technological Challenges and Application Perspectives (Marseille, France). Ed. by Eleni Efthimiou, Stavroula-Evita Fotinea, Thomas Hanke, Julie A. Hochgesang, Jette Kristoffersen, Johanna Mesch. Paris, France: European Language Resources Association (ELRA), pp. 75-82. ISBN: 979-10-95546-54-2. ACL: 2020.signlang-1.12.
Common tasks used in this corpus
Task | # recordings – open access | # recordings – restricted access | Data available |
---|---|---|---|
Calendar | 1 | 167 | https://meine-dgs.de/formats/format16_en.html |
Deaf life experiences | 63 | 259 | https://meine-dgs.de/formats/format3_en.html |
Debate | 28 | 137 | https://meine-dgs.de/formats/format6_de.html |
Describe process | 13 | 153 | https://meine-dgs.de/formats/format4_en.html |
Diachronic changes | 1 | 157 | https://meine-dgs.de/formats/format19_en.html |
Fire Alarm Story | 1 | 67 | https://meine-dgs.de/formats/format7_en.html |
Free conversation | 34 | 131 | https://meine-dgs.de/formats/format8_en.html |
Frog Story | 1 (in 6 parts) | 81 | https://meine-dgs.de/formats/format9_en.html |
Jokes | 88 | 49 | https://meine-dgs.de/formats/format21_en.html |
Lexical elicitation | 0 | 168 | none |
Pear Story | 1 | 82 | https://meine-dgs.de/formats/format5_en.html |
Route description | 1 | 65 | https://meine-dgs.de/formats/format15_en.html |
Sign Name | 0 | 168 | none |
Signs Movie | 1 | 141 | https://meine-dgs.de/formats/format20_en.html |
Subject areas | 26 (25 subject areas) | 349 | https://meine-dgs.de/formats/format13_en.html |
Sylvester and Tweety | 3 (each in 7 parts) | 81 | https://meine-dgs.de/formats/format18_en.html |
Warning and prohibition signs | 16 | 152 | https://meine-dgs.de/formats/format14_en.html |
What did you do when it happened | 52 | 279 | https://meine-dgs.de/formats/format1_en.html |
Your region | 13 | 67 | https://meine-dgs.de/formats/format10_en.html |
References
Primary references
- Reiner Konrad, Thomas Hanke, Gabriele Langer, Susanne König, Lutz König, Rie Nishio, Anja Regen (2022). "Public DGS Corpus: Annotation Conventions". Project Note. DOI: 10.25592/uhhfdm.822.
- Thomas Hanke, Marc Schulder, Reiner Konrad, Elena Jahn (2020). "Extending the Public DGS Corpus in Size and Depth". In: Proceedings of the LREC2020 9th Workshop on the Representation and Processing of Sign Languages: Sign Language Resources in the Service of the Language Community, Technological Challenges and Application Perspectives (Marseille, France). Ed. by Eleni Efthimiou, Stavroula-Evita Fotinea, Thomas Hanke, Julie A. Hochgesang, Jette Kristoffersen, Johanna Mesch. Paris, France: European Language Resources Association (ELRA), pp. 75-82. ISBN: 979-10-95546-54-2. ACL: 2020.signlang-1.12.
This entry was last modified on 27 March 2025.