The Sign Language Dataset Compendium


Kata Kolok Corpus

The Kata Kolok Corpus is a collection of Kata Kolok recordings compiled by Connie de Vos in Bali.

The Kata Kolok Corpus includes recordings not only of deaf signers but also of hearing signers, both fluent and non-fluent. This reflects the community-centered approach of the documentation and the highly bimodal-bilingual nature of the community. The recordings therefore contain spontaneous signing by deaf signers in monologues, dialogues and multi-party conversations in Kata Kolok; deaf-hearing interaction in dialogues and multi-party conversations; and hearing community members using Balinese with co-speech gestures.

In addition to these spontaneous conversations, targeted elicitation took place using standardised stimulus material such as Sylvester and Tweety. Recordings took place all over the village, and some were made with more than one camera.

Part of the data was translated into Indonesian. The Indonesian sentences were then translated into English.

A second corpus of signing children and a signbank also exist.

Language: Kata Kolok
Size: 50.5 hours of signing
Participants: 47 deaf signers; hearing signers, fluent and non-fluent (number unknown)
Metadata Format: IMDI, enriched following Crasborn and Hanke (2003)
Translation: Indonesian (size unknown); English (size unknown)
Annotation: 4.5 hours annotated
Data Format: ELAN
Licence: not available
Access: Open access to 16 files via The Language Archive; restricted access to 370 files
Webpage: Dataset:
Institution: Max Planck Institute for Psycholinguistics, Nijmegen
Publications: de Vos (2016)
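
The annotated portion of the corpus is stored in ELAN files, which are XML documents. As a rough sketch of how such a file could be inspected, the Python snippet below reads time-aligned annotations using only the standard library. The tier name "Gloss" and the sample annotation are invented for illustration and are not taken from the Kata Kolok Corpus, whose actual tier structure is not described in this entry.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written stand-in for an ELAN .eaf file. The tier name and the
# annotation value are invented for illustration, not taken from the corpus.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="1500"/>
  </TIME_ORDER>
  <TIER TIER_ID="Gloss">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1" TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>EXAMPLE</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def read_aligned_annotations(eaf_xml):
    """Return (tier_id, start_ms, end_ms, value) tuples for time-aligned annotations."""
    root = ET.fromstring(eaf_xml)
    # Map time slot IDs to millisecond values; slots without a value are skipped.
    slots = {
        ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
        for ts in root.iter("TIME_SLOT")
        if ts.get("TIME_VALUE") is not None
    }
    rows = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            # Start or end is None when the referenced slot has no time value.
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            value = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((tier.get("TIER_ID"), start, end, value))
    return rows


annotations = read_aligned_annotations(SAMPLE_EAF)
for tier_id, start, end, value in annotations:
    print(tier_id, start, end, value)
```

For a real file, the XML text would first be read from disk and passed to the same function. Annotations that refer to other annotations rather than to time slots (REF_ANNOTATION elements) are not handled in this sketch.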

Cite as

Connie de Vos, Ketut Kanta, Hannah Lutzenberger, Katie Mudd, Made Sumarni, and Ni Made Sumarni. (2007 - 2021). Item "Kata Kolok Corpus" in collection "Vos". The Language Archive. (Accessed [insert date])

Common tasks used in this corpus

Task: Free conversation
# recordings – open access: 0
# recordings – restricted access: 66
Data available

Task: Lexical elicitation
# recordings – open access: 0
# recordings – restricted access: 178
Data available

Task: Sylvester and Tweety
# recordings – open access: 0
# recordings – restricted access: 12
Data available

This entry was last modified on 6 January 2023.