Download the data

The data is available for download in several formats. All files are encoded in UTF-8.

Are you a data scientist working with sign languages for the first time? Please take the time to familiarise yourself with the topic. There are many pitfalls to be aware of. We suggest reading the paper Fox, N., Woll, B. & Cormier, K. Best practices for sign language technology research. Univ Access Inf Soc (2023).

CSV

Note that some fields contain commas. Such fields are enclosed in quotation marks, so use a proper CSV parser rather than splitting lines on commas.
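As an illustration of why a real CSV parser matters here, the following is a minimal sketch using Python's standard csv module. The sample data is invented for the example: the column names "gloss" and "keywords" and all values are hypothetical, and only the "confidence" column name comes from this page.

```python
import csv
import io

# Invented sample standing in for a downloaded CSV file. The column names
# "gloss" and "keywords" and all values are hypothetical placeholders.
SAMPLE = 'gloss,keywords,confidence\nLION-A,"lion, big cat",example-value\n'


def read_rows(handle):
    """Return the rows of a CSV file as a list of dicts keyed by header name."""
    return list(csv.DictReader(handle))


# For a real file, pass open(path, mode="r", encoding="utf-8", newline="")
# instead of the in-memory sample.
rows = read_rows(io.StringIO(SAMPLE))

# The quoted field keeps its embedded comma as part of a single value.
print(rows[0]["keywords"])  # lion, big cat
```

Splitting the same line on commas would yield four fields instead of three, which is the pitfall the note above warns about.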

The possible values for the “confidence” column are as follows:

TAB files for NLTK

This format allows you to easily import the new languages into the NLTK Wordnet library, and allows you to use NLTK's usual functions with the new languages. NLTK does not allow custom synsets, so this format is missing the synsets we created and the signs linked to them. As per NLTK's requirements, languages are identified by their ISO 639-3 language codes, so, for example, German Sign Language is referred to by the code gsg, rather than its common acronym DGS.

For each language we provide three different formats for textually representing signs as wordnet lemmas: by the type ID used in our resource, by their gloss/keywords, or by their video URL. Note that the video URL format only contains entries for which such a URL is available.

Lang.  ISO  Lemma is Type ID            Lemma is Gloss/keyword      Lemma is Video URL
BSL    bfi  sign_wordnet_video_bfi.tab  sign_wordnet_gloss_bfi.tab  sign_wordnet_video_bfi.tab
DGS    gsg  sign_wordnet_video_gsg.tab  sign_wordnet_gloss_gsg.tab  sign_wordnet_video_gsg.tab
DSGS   sgg  sign_wordnet_video_sgg.tab  sign_wordnet_gloss_sgg.tab  sign_wordnet_video_sgg.tab
GSL    gss  sign_wordnet_video_gss.tab  sign_wordnet_gloss_gss.tab  sign_wordnet_video_gss.tab
LSF    fsl  sign_wordnet_video_fsl.tab  sign_wordnet_gloss_fsl.tab  sign_wordnet_video_fsl.tab
NGT    dse  sign_wordnet_video_dse.tab  sign_wordnet_gloss_dse.tab  sign_wordnet_video_dse.tab
PJM    pso  sign_wordnet_video_pso.tab  sign_wordnet_gloss_pso.tab  sign_wordnet_video_pso.tab
STS    swl  sign_wordnet_video_swl.tab  sign_wordnet_gloss_swl.tab  sign_wordnet_video_swl.tab
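Because NLTK expects the ISO 639-3 code rather than the common acronym, a small lookup can help avoid mix-ups. The sketch below copies the acronym and code pairs from the table above and builds file names following the pattern visible there. The helper name tab_filename is hypothetical, and only the gloss and video URL formats are covered.

```python
# Acronym -> ISO 639-3 code, copied from the table above.
ISO_CODES = {
    "BSL": "bfi", "DGS": "gsg", "DSGS": "sgg", "GSL": "gss",
    "LSF": "fsl", "NGT": "dse", "PJM": "pso", "STS": "swl",
}


def tab_filename(acronym, lemma_format):
    """Build a TAB file name for a language acronym and lemma format.

    Hypothetical helper: it only reproduces the naming pattern seen in the
    table and accepts "gloss" or "video" as the lemma format.
    """
    if lemma_format not in ("gloss", "video"):
        raise ValueError(f"unsupported lemma format: {lemma_format!r}")
    return f"sign_wordnet_{lemma_format}_{ISO_CODES[acronym.upper()]}.tab"


print(tab_filename("DGS", "gloss"))  # sign_wordnet_gloss_gsg.tab
```

The same ISO_CODES value can then be passed as the language argument when loading the file into NLTK.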

Usage examples

Using the NGT sign videos in NLTK:

from nltk.corpus import wordnet as wn

# Register the NGT video URL lemmas under the ISO 639-3 code "dse"
with open("sign_wordnet_video_dse.tab", mode="r", encoding="utf-8") as f:
    wn.custom_lemmas(f, "dse")

# List the sign videos linked to the synset with offset 2129165 (noun)
wn.synset_from_pos_and_offset("n", 2129165).lemma_names("dse")
# Output: ['https://signbank.cls.ru.nl/dictionary/protected_media/glossvideo/NGT/LE/LEEUW-B-22.mp4', 'https://signbank.cls.ru.nl/dictionary/protected_media/glossvideo/NGT/LE/LEEUW-A-1759.mp4']

Using the NGT sign glosses in NLTK:

from nltk.corpus import wordnet as wn

# Register the NGT gloss lemmas under the ISO 639-3 code "dse"
with open("sign_wordnet_gloss_dse.tab", mode="r", encoding="utf-8") as f:
    wn.custom_lemmas(f, "dse")

# List the glosses linked to the synset with offset 2129165 (noun)
wn.synset_from_pos_and_offset("n", 2129165).lemma_names("dse")
# Output: ['LEEUW-B', 'LION-B', 'LEEUW-A', 'LION-A']

# Go the other way: look up the synset for a given gloss
sense = wn.synsets("LION-B", lang="dse")[0]
sense.offset()
# Output: 2129165
sense.pos()
# Output: 'n'
sense.definition()
# Output: 'large gregarious predatory feline of Africa and India having a tawny coat with a shaggy mane in the male'
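To match a synset found through NLTK against identifiers in other files, it can help to turn the offset and part of speech into a single string. The sketch below uses the zero-padded "offset-pos" form that is conventional for WordNet-style resources. Whether the files on this page use exactly that form is an assumption, and both function names are hypothetical.

```python
def synset_id(offset, pos):
    """Format an offset and part of speech as a zero-padded 'offset-pos' string."""
    return f"{offset:08d}-{pos}"


def parse_synset_id(identifier):
    """Split an 'offset-pos' string back into an (int offset, pos) pair."""
    offset, pos = identifier.split("-")
    return int(offset), pos


# Values taken from the NLTK example above.
print(synset_id(2129165, "n"))  # 02129165-n
```

The pair returned by parse_synset_id can be passed to wn.synset_from_pos_and_offset, with the part of speech as the first argument and the offset as the second.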