The Sign Language Dataset Compendium


Japanese Sign Language Colloquial Corpus

The Japanese Sign Language Colloquial Corpus is a collection of video clips of Japanese Sign Language signers. Construction of the corpus started in April 2011; filming took place from May to July 2012.

Signers were recorded in pairs, and each session lasted 1.5 hours. Three HD cameras were used for filming: one for each signer and one for a total view. The signers sat opposite each other in front of a blue background. Data was collected via interviews, dialogues and lexical elicitation tasks. The tasks were led by two field workers using the local signs.

The research, and hence also the annotation scheme, focuses not only on theoretical issues related to grammar and linguistics, but also on pragmatic and interactional phenomena.

Language Japanese Sign Language
Size 40 hours of recording
140 video clips
27371 tokens
Participants 120 participants
20–80 years
66 male, 54 female
From 7 prefectures: Gunma, Nara, Nagasaki, Fukuoka, Ishikawa, Toyama, Ibaraki
Metadata Format not available
Translation Japanese, size unknown
Annotation 80 files (60%) with basic annotation
Data Format ELAN
Licence JSL Colloquial Corpus licence
Access Public access to dialogue and elicitation recordings from Nara and Gunma via browsable homepage
Restricted access to interview recordings from Nara and Gunma and all recordings from Fukuoka, Ishikawa, Toyama, Ibaraki
More content publicly available on Japanese webpage only
Webpages Project page with Dataset:
Researcher portal:
Institution Bono Lab
Publications Bono et al. (2014)
Bono et al. (2020)
For further references, see

Cite as

HP name: Corpus Project in Colloquial Japanese Sign Language


Paper: Bono, Mayumi, Kikuchi, Kouhei, Cibulka, Paul, and Osugi, Yutaka (2014). Colloquial Corpus of Japanese Sign Language: A Design of Language Resources for Observing Sign Language Conversations. Proc. of the 9th edition of the Language Resources and Evaluation Conference, pp. 1898-1904 (May 26-31, Reykjavik, Iceland).

Common tasks used in this corpus

Task Lexical elicitation
# recordings – open access 20
# recordings – restricted access 0
Data available
Task Sylvester and Tweety
# recordings – open access 40
# recordings – restricted access 0
Data available

Articles mentioned above

This entry was last modified on 17 May 2023.