This paper presents the current development of the first large parallel corpus between Italian and Italian Sign Language (Lingua Italiana dei Segni, LIS). This initiative has been taken within the ATLAS project (Automatic Translation into Sign Languages), which aims at realizing a virtual interpreter that automatically translates an Italian text into LIS. The Italian-LIS virtual interpreter is implemented by means of two modules interfaced by the ATLAS Extended Written LIS (AEWLIS), which is a translation-oriented representation of LIS: the first module translates the source Italian text into AEWLIS; the second module transforms the AEWLIS content into a coherent LIS sequence, smoothly animated by a virtual character. As no significant amount of electronic data is available for Italian and LIS, we have started building a parallel corpus from scratch in order to train and tune the Italian-AEWLIS translation system, and to compare the resulting virtual animations with human-performed LIS interpretations. The corpus, which will be freely available, actually presents a tri-lingual structure, with the Italian text, the AEWLIS sequence, and the signed LIS video.
Nicola Bertoldi, Gabriele Tiotto, Paolo Prinetto, Elio Piccolo, Fabrizio Nunnari, Vincenzo Lombardo, Alessandro Mazzei, Rossana Damiano, Leonardo Lesmo, Andrea Del Principe. 2010. On the creation and the annotation of a large-scale Italian-LIS parallel corpus. In Proceedings of the LREC2010 4th Workshop on the Representation and Processing of Sign Languages: Corpora and Sign Language Technologies, pages 19–22, Valletta, Malta. European Language Resources Association (ELRA).
BibTeX Export
@inproceedings{bertoldi:10054:sign-lang:lrec,
author = {Bertoldi, Nicola and Tiotto, Gabriele and Prinetto, Paolo and Piccolo, Elio and Nunnari, Fabrizio and Lombardo, Vincenzo and Mazzei, Alessandro and Damiano, Rossana and Lesmo, Leonardo and Del Principe, Andrea},
title = {On the creation and the annotation of a large-scale {Italian-LIS} parallel corpus},
pages = {19--22},
editor = {Dreuw, Philippe and Efthimiou, Eleni and Hanke, Thomas and Johnston, Trevor and Mart{\'i}nez Ruiz, Gregorio and Schembri, Adam},
booktitle = {Proceedings of the {LREC2010} 4th Workshop on the Representation and Processing of Sign Languages: Corpora and Sign Language Technologies},
maintitle = {7th International Conference on Language Resources and Evaluation ({LREC} 2010)},
publisher = {{European Language Resources Association (ELRA)}},
address = {Valletta, Malta},
day = {22--23},
month = may,
year = {2010},
language = {english},
url = {https://www.sign-lang.uni-hamburg.de/lrec/pub/10054.pdf}
}