The Sign Language Dataset Compendium


Corpus

British Sign Language Corpus

The British Sign Language Corpus is a collection of British Sign Language video clips of 249 deaf signers from the UK. The BSL Corpus project, based at the Deafness Cognition and Language Research Centre, University College London, ran from 2008 to 2011 and was led by Adam Schembri. A related dataset is the BSL SignBank.

Metadata on the participants was collected via 39 questions on personal and language background, following the metadata collection standards of Crasborn and Hanke (2003) and using the IMDI format.

The recordings were made in a studio using three cameras at three different angles (one on each signer and one on the pair). The participants were recorded in pairs, sitting next to each other in front of a blue background, and were asked in advance to wear plain coloured clothing. The tasks were moderated by a deaf researcher, except for the unobserved conversation, for which the researcher left the room.

Language: British Sign Language
Size: 125 hours recorded, 70,000 tokens annotated
Participants: 249 participants
    Deaf
    4 age groups: 18–35, 35–50, 51–70 and 71 years and older
    From 8 cities: London, Bristol, Birmingham, Manchester, Newcastle, Glasgow, Cardiff, Belfast
    Balanced for gender, ethnicity, social class and language background
Metadata Format: IMDI
Translation: English, size unknown
Annotation: Annotation based on research projects (different bundles of annotation), basically following Johnston (2010)
Data Format: ELAN
Licence: CC BY-SA 4.0 (for narrative and lexical elicitation data)
    User licence (for conversation and interview data)
Access: Public access via browsable homepage
    Open access to narrative and lexical elicitation data
    Restricted access to conversation and interview data requires confirmed registration
Webpages: Project page: https://bslcorpusproject.org/
    Dataset: https://bslcorpusproject.org/cava/
    Public access: https://bslcorpusproject.org/data/region/
Institution: University College London
Publications: https://bslcorpusproject.org/publications-and-presentations/
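
Since the data format is listed as ELAN, annotations would typically be distributed as ELAN annotation files (.eaf), which are XML documents. The following is a minimal sketch of how time-aligned annotations on one tier could be read with the Python standard library. The sample document, the tier name "Gloss" and the gloss values are invented for illustration and are not taken from the BSL Corpus; real files may use different tier names and may also contain reference annotations, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Hand-written, heavily reduced EAF-like document for illustration only.
# The tier name "Gloss" and the values are hypothetical, not BSL Corpus data.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="0"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="480"/>
    <TIME_SLOT TIME_SLOT_ID="ts3" TIME_VALUE="910"/>
  </TIME_ORDER>
  <TIER TIER_ID="Gloss">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
                            TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>HELLO</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a2"
                            TIME_SLOT_REF1="ts2" TIME_SLOT_REF2="ts3">
        <ANNOTATION_VALUE>WORLD</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def read_tier(eaf_text, tier_id):
    """Return (start_ms, end_ms, value) tuples for one tier of an EAF document."""
    root = ET.fromstring(eaf_text)

    # Map time slot IDs to millisecond offsets; unaligned slots have no value.
    slots = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
        if slot.get("TIME_VALUE") is not None
    }

    rows = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_id:
            continue
        # Only time-aligned annotations are handled in this sketch.
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            start = slots.get(ann.get("TIME_SLOT_REF1"))
            end = slots.get(ann.get("TIME_SLOT_REF2"))
            value = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((start, end, value))
    return rows


glosses = read_tier(SAMPLE_EAF, "Gloss")
for start, end, value in glosses:
    print(start, end, value)
```

For a downloaded file, the same function could be applied to the file's text content. Which tiers exist in a given file depends on the annotation bundle it belongs to.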

Cite as

Video data deposited in 2011: Schembri, Adam, Jordan Fenlon, Ramas Rentelis, & Kearsy Cormier. (2011). British Sign Language Corpus Project: A corpus of digital video data of British Sign Language 2008-2011 (First Edition). London: University College London. (https://www.bslcorpusproject.org)

Annotations deposited in 2014: Schembri, Adam, Jordan Fenlon, Ramas Rentelis, & Kearsy Cormier. (2014). British Sign Language Corpus Project: A corpus of digital video data and annotations of British Sign Language 2008-2014 (Second Edition). London: University College London. (https://www.bslcorpusproject.org)

Annotations including translations deposited in 2017: Schembri, Adam, Jordan Fenlon, Ramas Rentelis, & Kearsy Cormier. (2017). British Sign Language Corpus Project: A corpus of digital video data and annotations of British Sign Language 2008-2017 (Third Edition). London: University College London. (https://www.bslcorpusproject.org)

Common tasks used in this corpus

Task: Free conversation
    Recordings, open access: 0
    Recordings, restricted access: 452
    Data available: https://digital-collections.ucl.ac.uk/R/HN97JGFGQ94BPR12C88YTSJ3G95CJ4VVC4SMMY9AF28RV8BJ49-00692?func=collections-result&collection_id=2648
Task: Language awareness
    Recordings, open access: 0
    Recordings, restricted access: 367
    Data available: https://digital-collections.ucl.ac.uk/R/1N1I7A9DP4V65Y2LEYRFGQ3PNHEGD8I2BYD113KI6IGX52B6FP-08527?func=collections-result&collection_id=2649
Task: Lexical elicitation
    Recordings, open access: 386
    Recordings, restricted access: 0
    Data available: https://digital-collections.ucl.ac.uk/R/HN97JGFGQ94BPR12C88YTSJ3G95CJ4VVC4SMMY9AF28RV8BJ49-00698?func=collections-result&collection_id=2651
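
As a small worked example, the recording counts listed for these three tasks can be tallied by access level. The dictionary layout and variable names below are illustrative choices; the numbers are the ones given in this entry and cover only the tasks listed here, not necessarily the whole corpus.

```python
# Recording counts per task as listed in this entry.
recordings = {
    "Free conversation": {"open": 0, "restricted": 452},
    "Language awareness": {"open": 0, "restricted": 367},
    "Lexical elicitation": {"open": 386, "restricted": 0},
}

# Sum each access level across the listed tasks.
open_total = sum(counts["open"] for counts in recordings.values())
restricted_total = sum(counts["restricted"] for counts in recordings.values())
grand_total = open_total + restricted_total

print(f"Open access: {open_total}")              # 386
print(f"Restricted access: {restricted_total}")  # 819
print(f"All listed recordings: {grand_total}")   # 1205
```

Of the recordings listed for these tasks, only the lexical elicitation recordings are openly accessible; the rest require confirmed registration.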

Articles mentioned above

This entry was last modified on 7 August 2023.