British Sign Language Corpus
The British Sign Language Corpus is a collection of British Sign Language video clips of 249 deaf signers from the UK. The BSL Corpus project, based at the Deafness Cognition and Language Research Centre, University College London, ran from 2008 to 2011 and was led by Adam Schembri. A related dataset is the BSL SignBank.
Metadata on the participants was collected via 39 questions on personal and language background, following the metadata collection standards of Crasborn and Hanke (2003) and using the IMDI format.
The participants were recorded in pairs in a studio, sitting next to each other in front of a blue background. Three cameras were used at three different angles (one on each signer and one on the pair). Participants were asked in advance to wear plain coloured clothing. The tasks were moderated by a deaf researcher, except for the unobserved conversation, for which the researcher left the room.
|Language||British Sign Language|
|Size||125 hours recorded, 70,000 tokens annotated|
4 age groups: 18–35, 35–50, 51–70 and 71 years and older
From 8 cities: London, Bristol, Birmingham, Manchester, Newcastle, Glasgow, Cardiff, Belfast
Balanced for gender, ethnicity, social class and language background
|Translation||English, size unknown|
|Annotation||Annotation based on research projects (different bundles of annotation), basically following Johnston (2010)|
CC BY-SA 4.0 (for narrative and lexical elicitation data)
User license (for conversation and interview data)
Public access via browsable homepage
Open access to narrative and lexical elicitation data
Restricted access to conversation and interview data requires confirmed registration
Project page: https://bslcorpusproject.org/
Public access: https://bslcorpusproject.org/data/region/
|Institution||University College London|
Video data deposited in 2011: Schembri, Adam, Jordan Fenlon, Ramas Rentelis, & Kearsy Cormier. (2011). British Sign Language Corpus Project: A corpus of digital video data of British Sign Language 2008-2011 (First Edition). London: University College London. (https://www.bslcorpusproject.org)
Annotations deposited in 2014: Schembri, Adam, Jordan Fenlon, Ramas Rentelis, & Kearsy Cormier. (2014). British Sign Language Corpus Project: A corpus of digital video data and annotations of British Sign Language 2008-2014 (Second Edition). London: University College London. (https://www.bslcorpusproject.org)
Annotations including translations deposited in 2017: Schembri, Adam, Jordan Fenlon, Ramas Rentelis, & Kearsy Cormier. (2017). British Sign Language Corpus Project: A corpus of digital video data and annotations of British Sign Language 2008-2017 (Third Edition). London: University College London. (https://www.bslcorpusproject.org)
Common tasks used in this corpus
|# recordings – open access||0|
|# recordings – restricted access||452|
|# recordings – open access||0|
|# recordings – restricted access||367|
|# recordings – open access||386|
|# recordings – restricted access||0|
Articles mentioned above
- Onno A. Crasborn, Thomas Hanke (2003). "Additions to the IMDI metadata set for sign language corpora. Agreements at an ECHO workshop". Workshop Report. 7 pp.
- Trevor Johnston (2010). "From archive to corpus: Transcription and annotation in the creation of signed language corpora". In: International Journal of Corpus Linguistics 15(1). John Benjamins, pp. 106-131. ISSN: 1569-9811. DOI: 10.1075/ijcl.15.1.05joh.
This entry was last modified on 7 August 2023.