The Sign Language Dataset Compendium


Corpus

Corpus NGT

The Corpus NGT is an open access online corpus of dialogues between 100 native users of Sign Language of the Netherlands (NGT). It is based at Radboud University Nijmegen and was created by Onno Crasborn, Inge Zwitserlood and Johan Ros in a two-year project from 2006 to 2008. Several follow-up projects have worked with the corpus and extended the amount of annotated data available online. The Corpus NGT team created their own annotation conventions (see Crasborn et al., 2020). A special feature is that parts of the data are translated as a voice-over rather than in the more common form of a written Dutch translation; written translations exist for other parts of the corpus.

Recordings were made with two HDV cameras and two digital MiniDV cameras. The participants were recorded in pairs, sitting opposite each other in front of a dark background. The signing was captured in a front view and a top view (from above). Recordings took place at Radboud University and the Max Planck Institute for Psycholinguistics as well as at Deaf schools, Deaf clubs and other places familiar to the Deaf participants. A Deaf signer led the participants through the recordings.

Language Sign Language of the Netherlands
Size 72 hours recorded, 150,000 tokens and 3,300 types annotated
Participants 100 participants
Deaf, native signers
8 age groups: 11–19, 20–29, 30–39, 40–49, 50–59, 60–69, 70–79, 80–89 years
From 5 regions: Amsterdam, Groningen, Rotterdam, Gestel, Voorburg
Metadata Format CMDI
Translation Dutch, 15 hours translated, 15,000 sentences (21%)
Annotation See Crasborn et al. (2020)
Data Format ELAN
Licence Openly accessible videos under CC BY-NC-SA 3.0 NL
Annotations under CC BY-NC-SA 4.0
Access Public access via browsable homepage
Open access to some video and annotation material via The Language Archive
Restricted access to some video and metadata requires individual license agreement
Webpages Project page: https://www.ru.nl/en/departments/centre-for-language-studies/sign-language-linguistics
Dataset: https://hdl.handle.net/1839/00-0000-0000-0004-DF8E-6
Public access: https://www.corpusngt.nl/
Institution Radboud University Nijmegen
Publications https://oametisp.uci.ru.nl/metisprd/pk_apa_n.results?p_url_id=29861
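Since the annotations are distributed as ELAN files, a reader may want to pull time-aligned glosses out of one. The following is a minimal sketch using only the Python standard library, run against a tiny hand-written sample in the general shape of an ELAN .eaf document. The tier name "GlossR S1", the gloss values and the helper name read_tier are assumptions made for illustration and are not taken from the corpus; real tier names should be checked against the annotation conventions (Crasborn et al., 2020).

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the general shape of an ELAN .eaf file.
# Tier name and gloss values are invented for illustration only.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="1000"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1450"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="1500"/>
    <TIME_SLOT TIME_SLOT_ID="ts4" TIME_VALUE="2100"/>
  </TIME_ORDER>
  <TIER TIER_ID="GlossR S1" LINGUISTIC_TYPE_REF="gloss">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
          TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>EXAMPLE-A</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2"
          TIME_SLOT_REF1="ts3" TIME_SLOT_REF2="ts4">
        <ANNOTATION_VALUE>EXAMPLE-B</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def read_tier(eaf_xml, tier_id):
    """Return (start_ms, end_ms, value) tuples for one time-aligned tier."""
    root = ET.fromstring(eaf_xml)
    # Map time slot IDs to millisecond offsets; skip slots without a value.
    slots = {
        ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
        for ts in root.iter("TIME_SLOT")
        if ts.get("TIME_VALUE") is not None
    }
    rows = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_id:
            continue
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            value = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((start, end, value))
    return rows


for start, end, value in read_tier(SAMPLE_EAF, "GlossR S1"):
    print(start, end, value)
```

The sketch only handles time-aligned annotations; tiers whose annotations refer to a parent annotation instead of time slots would need extra handling.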

Cite as

Onno Crasborn, Inge Zwitserlood & Johan Ros. 2008. The Corpus NGT. An open access digital corpus of movies with annotations of Sign Language of the Netherlands. Centre for Language Studies, Radboud University Nijmegen. URL: http://hdl.handle.net/hdl:1839/00-0000-0000-0004-DF8E-6 ISLRN: 175-346-174-413-3

Onno Crasborn & Inge Zwitserlood. 2008. The Corpus NGT: an online corpus for professionals and laymen. In: O. Crasborn, T. Hanke, E. Efthimiou, I. Zwitserlood & E. Thoutenhoofd (eds.), Construction and Exploitation of Sign Language Corpora. 3rd Workshop on the Representation and Processing of Sign Languages. ELDA, Paris, pp. 44–49.

Common tasks used in this corpus

Task Deaf life experiences
# recordings – open access 60
# recordings – restricted access 3
Data available https://hdl.handle.net/1839/00-0000-0000-0009-06F6-0
Task Free conversation
# recordings – open access 60
# recordings – restricted access 0
Data available https://hdl.handle.net/1839/00-0000-0000-0009-06F8-6
Task Frog Story
# recordings – open access 42
# recordings – restricted access 1
Data available https://hdl.handle.net/1839/00-0000-0000-0009-06F9-3
Task Language awareness
# recordings – open access 43
# recordings – restricted access 2
Data available https://hdl.handle.net/1839/00-0000-0000-0009-06FF-7
Task Present yourself
# recordings – open access 0
# recordings – restricted access 46
Data available https://hdl.handle.net/1839/00-0000-0000-0009-06FA-5
Task Retelling of fables
# recordings – open access 55
# recordings – restricted access 10
Data available https://hdl.handle.net/1839/00-0000-0000-0009-06F7-8
Task Sylvester and Tweety
# recordings – open access 46
# recordings – restricted access 7
Data available https://hdl.handle.net/1839/00-0000-0000-0009-06F5-E
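
To see how the recordings for these tasks split between open and restricted access, the counts listed above can be added up. The short sketch below copies the figures from the list; the names TASKS and summarize are illustrative. The totals cover only the common tasks listed in this entry, not necessarily the whole corpus.

```python
# (task, open-access recordings, restricted-access recordings),
# copied from the task list in this entry.
TASKS = [
    ("Deaf life experiences", 60, 3),
    ("Free conversation", 60, 0),
    ("Frog Story", 42, 1),
    ("Language awareness", 43, 2),
    ("Present yourself", 0, 46),
    ("Retelling of fables", 55, 10),
    ("Sylvester and Tweety", 46, 7),
]


def summarize(tasks):
    """Return (open total, restricted total, open-access share)."""
    open_total = sum(n_open for _, n_open, _ in tasks)
    restricted_total = sum(n_restricted for _, _, n_restricted in tasks)
    share = open_total / (open_total + restricted_total)
    return open_total, restricted_total, share


open_total, restricted_total, share = summarize(TASKS)
print(f"open: {open_total}, restricted: {restricted_total}, "
      f"open share: {share:.1%}")
# open: 306, restricted: 69, open share: 81.6%
```

"Present yourself" is the only listed task with no open-access recordings, and it accounts for 46 of the 69 restricted ones.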

This entry was last modified on 4 July 2023.