Longitudinal, spontaneous production data have long been a cornerstone of language acquisition studies, but building corpora of sign language acquisition data poses considerable challenges. Our experience began with the development of a sign language acquisition corpus more than 15 years ago and has recently included a small-scale experiment in corpus sharing between our two research groups. Our combined database includes regular samples of deaf and hearing children between the ages of 1;06 and 3;06 years acquiring ASL as their native language. The process through which we generate and share transcripts has undergone dramatic changes, always with the triple goal of creating transcripts with sufficient information for the reader to locate regions of interest, while keeping the video fully accessible and minimizing the time required to generate transcripts. In this paper we summarize the various incarnations of our transcription system, from simple Word documents with minimal integration of video, to a combination of FileMaker Pro software integrated with Autolog, to a fully integrated transcript+video package in ELAN. Along the way, we discuss the potential of ELAN to surmount several obstacles that have traditionally stood in the way of large-scale corpus sharing in the sign language acquisition community.
Diane Lillo-Martin, Deborah Chen Pichler. 2008. Development of Sign Language Acquisition Corpora. In Proceedings of the LREC2008 3rd Workshop on the Representation and Processing of Sign Languages: Construction and Exploitation of Sign Language Corpora, pages 129–133, Marrakech, Morocco. European Language Resources Association (ELRA).
BibTeX Export
@inproceedings{lillomartin:08035:sign-lang:lrec,
author = {Lillo-Martin, Diane and Chen Pichler, Deborah},
title = {Development of Sign Language Acquisition Corpora},
pages = {129--133},
editor = {Crasborn, Onno and Efthimiou, Eleni and Hanke, Thomas and Thoutenhoofd, Ernst D. and Zwitserlood, Inge},
booktitle = {Proceedings of the {LREC2008} 3rd Workshop on the Representation and Processing of Sign Languages: Construction and Exploitation of Sign Language Corpora},
maintitle = {6th International Conference on Language Resources and Evaluation ({LREC} 2008)},
publisher = {{European Language Resources Association (ELRA)}},
address = {Marrakech, Morocco},
day = {1},
month = jun,
year = {2008},
language = {english},
url = {https://www.sign-lang.uni-hamburg.de/lrec/pub/08035.pdf}
}