The Sign Language Dataset Compendium


Corpus

DGS Corpus

The DGS Corpus is a collection of German Sign Language (DGS) data from 330 signers across Germany. The 15-year project started in 2009 and is based at the Institute of German Sign Language and Communication of the Deaf at the Universität Hamburg. It is led by Thomas Hanke and Annika Herrmann. The DGS Corpus is used to build the DGS-German dictionary DW-DGS.

The signers were recorded in pairs in a mobile studio that travelled to 13 locations in Germany. They sat opposite each other in front of a blue background. Seven cameras were used in total: five HD cameras and two Bumblebees. The Bumblebees were later replaced by three HD stereo cameras. The cameras were set up at three different angles: one recording a total view including the moderator, one filming the signers from the front and one from above. The original resolution is 1080i50 for the videos from 2010 and 720p50 for the videos from 2011 onwards. Public data is provided in 360p50. A Deaf moderator guided the signers through the tasks.

The DGS Corpus is available in different formats:

MY DGS is a community portal that offers easy access to the data, tailored for users interested in the content of the conversations. Videos can be watched in an online viewer with subtitles.

MY DGS – annotated is a research portal which offers the annotated corpus data for linguistic research.

MY DGS – ANNIS is another research portal making the DGS Corpus available via the corpus tool ANNIS, a web browser-based search and visualization architecture for complex multilayer linguistic corpora.

Language: German Sign Language
Size: 560 hours recorded, 657,000 tokens annotated
Participants: 330 participants
    4 age groups: 18–30, 31–45, 46–60, 61 years and older
    165 female, 165 male
    From all over Germany
Metadata Format: CMDI
Translation: German and English, 375.8 hours (German), 113 hours (English)
Annotation: See Konrad et al. (2022)
    90.9 hours annotated
Data Format: iLex
Licence: DGS Corpus License
Access: Public access via browsable homepage
    Open access to 50 hours of video, annotation and translation in iLex, ELAN and SRT format
    Restricted access for researchers to further data requires individual license agreement
Webpages: Project page: https://dgs-korpus.de
    Dataset: https://ling.meine-dgs.de
    Public access: https://meine-dgs.de
    ANNIS: https://annis.meine-dgs.de
Institution: Universität Hamburg
Publications: https://dgs-korpus.de/publications.html
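
The access information above lists SRT among the open-access formats. As a minimal sketch of how such subtitle files could be read, the following Python code parses SRT text into timed cues. It assumes the generic SRT layout (an index line, a timing line of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`, then one or more text lines); the names `Cue` and `parse_srt` are illustrative and not part of any DGS Corpus tooling, and the exact content of the corpus's SRT files has not been checked here.

```python
import re
from dataclasses import dataclass

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


@dataclass
class Cue:
    start: float  # cue start in seconds
    end: float    # cue end in seconds
    text: str     # subtitle text, multiple lines joined by spaces


def _seconds(h, m, s, ms):
    """Convert hour, minute, second and millisecond strings to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(content: str) -> list[Cue]:
    """Parse SRT-formatted text into a list of cues (hypothetical helper)."""
    cues = []
    normalized = content.replace("\r\n", "\n").strip()
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                g = match.groups()
                text = " ".join(l.strip() for l in lines[i + 1:] if l.strip())
                cues.append(Cue(_seconds(*g[:4]), _seconds(*g[4:]), text))
                break  # only one timing line per cue block
    return cues
```

A file could then be loaded with something like `parse_srt(open("transcript.srt", encoding="utf-8").read())`, where `transcript.srt` is a placeholder file name.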

Cite as

Konrad, R., Hanke, T., Langer, G., Blanck, D., Bleicken, J., Hofmann, I., Jeziorski, O., König, L., König, S., Nishio, R., Regen, A., Salden, U., Wagner, S., Worseck, S., Böse, O., Jahn, E., Schulder, M. 2020. MEINE DGS – annotiert. Öffentliches Korpus der Deutschen Gebärdensprache, 3. Release / MY DGS – annotated. Public Corpus of German Sign Language, 3rd release [Dataset]. Universität Hamburg. https://doi.org/10.25592/dgs.corpus-3.0

Common tasks used in this corpus

Task | # recordings – open access | # recordings – restricted access | Data available
--- | --- | --- | ---
Calendar | 1 | 167 | https://meine-dgs.de/formats/format16_en.html
Deaf life experiences | 63 | 259 | https://meine-dgs.de/formats/format3_en.html
Describe process | 13 | 153 | https://meine-dgs.de/formats/format4_en.html
Diachronic changes | 1 | 157 | https://meine-dgs.de/formats/format19_en.html
Fire Alarm Story | 1 | 67 | https://meine-dgs.de/formats/format7_en.html
Free conversation | 34 | 131 | https://meine-dgs.de/formats/format8_en.html
Frog Story | 1 (in 6 parts) | 81 | https://meine-dgs.de/formats/format9_en.html
Jokes | 88 | 49 | https://meine-dgs.de/formats/format21_en.html
Lexical elicitation | 0 | 168 | none
Pear Story | 1 | 82 | https://meine-dgs.de/formats/format5_en.html
Route description | 1 | 65 | https://meine-dgs.de/formats/format15_en.html
Sign Name | 0 | 168 | none
Signs Movie | 1 | 141 | https://meine-dgs.de/formats/format20_en.html
Subject areas | 26 (25 subject areas) | 349 | https://meine-dgs.de/formats/format13_en.html
Sylvester and Tweety | 3 (each in 7 parts) | 81 | https://meine-dgs.de/formats/format18_en.html
Warning and prohibition signs | 16 | 152 | https://meine-dgs.de/formats/format14_en.html
What did you do when it happened | 52 | 279 | https://meine-dgs.de/formats/format1_en.html
Your region | 13 | 67 | https://meine-dgs.de/formats/format10_en.html
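
For readers who want to work with the per-task counts above programmatically, the following Python sketch copies them into a list and derives simple summaries. The counts are transcribed from this entry; notes such as "in 6 parts" are ignored and only the leading recording count is used, which is an assumption about how those cells should be read. The names `TASKS`, `totals` and `without_open_access` are illustrative only.

```python
# (task, open-access recordings, restricted-access recordings),
# transcribed from the task list in this entry.
TASKS = [
    ("Calendar", 1, 167),
    ("Deaf life experiences", 63, 259),
    ("Describe process", 13, 153),
    ("Diachronic changes", 1, 157),
    ("Fire Alarm Story", 1, 67),
    ("Free conversation", 34, 131),
    ("Frog Story", 1, 81),
    ("Jokes", 88, 49),
    ("Lexical elicitation", 0, 168),
    ("Pear Story", 1, 82),
    ("Route description", 1, 65),
    ("Sign Name", 0, 168),
    ("Signs Movie", 1, 141),
    ("Subject areas", 26, 349),
    ("Sylvester and Tweety", 3, 81),
    ("Warning and prohibition signs", 16, 152),
    ("What did you do when it happened", 52, 279),
    ("Your region", 13, 67),
]


def totals(tasks):
    """Return (open-access total, restricted-access total) over all tasks."""
    open_total = sum(open_count for _, open_count, _ in tasks)
    restricted_total = sum(restricted for _, _, restricted in tasks)
    return open_total, restricted_total


def without_open_access(tasks):
    """Return the names of tasks that have no open-access recordings."""
    return [name for name, open_count, _ in tasks if open_count == 0]


if __name__ == "__main__":
    print(totals(TASKS))
    print(without_open_access(TASKS))
```

Keeping the counts in a plain list of tuples makes it easy to re-check them against the entry if the figures are updated.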


This entry was last modified on 7 August 2023.