Using sign language corpora as bilingual corpora for data mining: Contrastive linguistics and computer-assisted annotation
Meurant, Laurence | Cleve, Anthony | Crasborn, Onno 
- Volume:
- Proceedings of the LREC2016 7th Workshop on the Representation and Processing of Sign Languages: Corpus Mining
- Venue:
- Portorož, Slovenia
- Date:
- 28 May 2016
- Pages:
- 159–166
- Publisher:
- European Language Resources Association (ELRA)
- License:
- CC BY-NC 4.0
- sign-lang ID:
- 16032
Content Categories
- Languages:
- French Belgian Sign Language, Sign Language of the Netherlands
- Corpora:
- Corpus NGT, Corpus LSFB
- Lexical Databases:
- Lex-LSFB
Abstract
More and more sign languages are now documented by large-scale digital corpora. However, exploiting sign language (SL) corpus data still depends on the time-consuming and expensive manual task of annotation. In this paper, we present ongoing research that aims to test a new approach to better mine SL data. It relies on the methodology of corpus-based contrastive linguistics, exploiting SL corpora as bilingual corpora. We present and illustrate the main improvements we foresee in developing such an approach: downstream, for the benefit of linguistic description and of the bilingual (signed-spoken) competence of teachers, learners and users; and upstream, to enable the automation of the annotation process for sign language data. We also describe the methodology we are using to develop a concordancer able to turn SL corpora into searchable translation corpora, and to derive from it a tool to support annotation.
Cite as
Citation in ACL Citation Format
Laurence Meurant, Anthony Cleve, Onno Crasborn. 2016. Using sign language corpora as bilingual corpora for data mining: Contrastive linguistics and computer-assisted annotation. In Proceedings of the LREC2016 7th Workshop on the Representation and Processing of Sign Languages: Corpus Mining, pages 159–166, Portorož, Slovenia. European Language Resources Association (ELRA).
BibTeX Export
@inproceedings{meurant:16032:sign-lang:lrec,
  author    = {Meurant, Laurence and Cleve, Anthony and Crasborn, Onno},
  title     = {Using sign language corpora as bilingual corpora for data mining: Contrastive linguistics and computer-assisted annotation},
  pages     = {159--166},
  editor    = {Efthimiou, Eleni and Fotinea, Stavroula-Evita and Hanke, Thomas and Hochgesang, Julie A. and Kristoffersen, Jette and Mesch, Johanna},
  booktitle = {Proceedings of the {LREC2016} 7th Workshop on the Representation and Processing of Sign Languages: Corpus Mining},
  maintitle = {10th International Conference on Language Resources and Evaluation ({LREC} 2016)},
  publisher = {{European Language Resources Association (ELRA)}},
  address   = {Portoro{\v z}, Slovenia},
  day       = {28},
  month     = may,
  year      = {2016},
  language  = {english},
  url       = {https://www.sign-lang.uni-hamburg.de/lrec/pub/16032.pdf}
}