Welcome to the Public DGS Corpus!

In this portal, you will find 50 hours of video material from the DGS-Korpus project, made available together with annotations for research purposes.

If you want to download materials, please pay attention to the license conditions.

By clicking Transcripts, you can view the available data sorted by either transcript name or elicitation format. Several download links are available. By clicking the transcript name, you can open an online preview of the transcript. If you can read German and want to browse the videos first without paying attention to the annotation, you will find the videos with subtitles in the MEINE DGS portal.

By clicking Types, you can view the list of all sign types occurring in the public corpus. Click on one of these types to see all corresponding tokens in the public corpus. Clicking on a token reference then takes you directly to the occurrence in the transcript.

Keywords have been assigned to all transcripts in order to provide rough content-based access to the data. By clicking Topics, you can view an index of all keywords and find the transcripts they have been assigned to.

Background Information

Data Elicitation Formats

We used a set of 20 different tasks for the informants. The formats ranged from story-retelling (with prompts in sign, picture, or movie) to discussions on a given topic as well as free conversations. With careful planning, it was possible to make the mix of formats varied enough that most participants enjoyed the recording session despite a net length of 5 hours.

The set contains a number of tasks previously used in other corpus projects, both on spoken and sign languages, to lay a basis for cross-linguistic research, as well as new formats. Not all details of the newly developed elicitation materials are available in the publications in order to keep the material suitable for future data collections. The materials are, however, available to other researchers upon request.

For more detail, please consult the following publications:

  • Hanke, Thomas / Hong, Sung-Eun / König, Susanne / Langer, Gabriele / Nishio, Rie / Rathmann, Christian (2010): “Designing Elicitation Stimuli and Tasks for the DGS Corpus Project”. Poster presented at the Theoretical Issues in Sign Language Research Conference (TISLR 10), Sept 30–Oct 2, 2010, at Purdue University, Indiana, USA. [Poster]
  • Nishio, Rie / Hong, Sung-Eun / König, Susanne / Konrad, Reiner / Langer, Gabriele / Hanke, Thomas / Rathmann, Christian (2010): “Elicitation methods in the DGS (German Sign Language) Corpus Project”. Poster presented at the 4th Workshop on the Representation and Processing of Sign Languages: Corpora and Sign Language Technologies, following the 2010 LREC Conference, May 22–23, 2010, Valletta, Malta. In: Workshop Proceedings (W13). Paris: ELRA, pp. 178-185. [Paper] [Poster]

Data Collection Regions

Based on experience from earlier projects, one of the key decisions was to use a mobile studio that could be set up in different places across Germany. The idea was to give the recordings as much of a “local” spirit as possible, with the recording location in the region and all persons involved coming from that region, while still ensuring the high-quality recordings needed for transcription. The number of recording locations therefore had to be a compromise between localness in the above sense, the informants’ travel times, and logistics.

Our solution was to define thirteen data collection regions. In drawing them, we tried to respect the catchment areas of current and former Schools for the Deaf; state (Bundesland) borders, which determine educational settings among other things, and especially the former border between West and East Germany; suspected dialectal borders; and practical considerations such as travel times to the recording locations. The regions were further subdivided into up to five sub-regions relevant for informant selection. Large metropolitan areas form their own sub-regions, in contrast to others with mixed or more rural structures.

Below on the left, you will find a map of Germany showing the data collection regions. For comparison, a map of Germany showing the states (Bundesländer) is on the right.