The Sign Language Dataset Compendium


Corpus

SIGNOR Corpus

The SIGNOR Corpus is a collection of Slovene Sign Language (SZJ) video data from 80 signers from Slovenia. The Corpus Signor project was based at the University of Ljubljana, ran from 2011 to 2014 and was led by Špela Vintar.

The annotation is based largely on the DGS Corpus Conventions (Konrad et al., 2022). Seven layers of annotation are provided: segmentation or tokenisation, glossing or lemmatisation, mouthing, HamNoSys transcription, meaning, compositional meaning, and segmentation into utterances. For the database of meanings, the Slovene WordNet SloWNet (Fišer and Sagot, 2015) was adapted.

The recordings took place at the premises of Deaf clubs, partially at the informants' homes and at the Deaf Institute Ljubljana. A moderator led the participants through the tasks.

Language Slovene Sign Language
Size 40 hours recorded, 30335 tokens and 1976 types annotated
Participants 80 participants
Metadata Format not available
Translation not available
Annotation Based on Konrad et al. (2022)
See Jerko and Vintar (2015) for more information
Data Format iLex
Licence not available
Access Public access via browsable homepage (temporarily unavailable at the time of writing)
Webpage Project page: http://lojze.lugos.si/signor/en.html
Institution University of Ljubljana
Publications http://lojze.lugos.si/signor/en.html#objave

Cite as

not available

Common tasks used in this corpus

Task Frog Story
# recordings – open access 0
# recordings – restricted access not available
Data available none
Task Present yourself
# recordings – open access 0
# recordings – restricted access not available
Data available none

Articles mentioned above

This entry was last modified on 6 January 2023.