We report on the high success rates of our new, scalable, signer-independent computational approach for sign recognition from monocular video, exploiting linguistically annotated ASL data sets. We recognize signs using a hybrid framework that combines state-of-the-art learning methods with features based on what is known about the linguistic composition of lexical signs. We model and recognize the sub-components of sign production, with attention to hand shape, orientation, location, and motion trajectories, as well as facial features, and we combine these within a CRF framework. The effect is to make the sign recognition problem robust, scalable, and feasible with smaller datasets than are required for purely data-driven methods. On a 350-sign vocabulary of isolated, citation-form lexical signs from the American Sign Language Lexicon Video Dataset (ASLLVD), including both 1- and 2-handed signs, we achieve a top-1 accuracy of 93.6% and a top-5 accuracy of 97.9%. The high probability with which we can produce 5 sign candidates that contain the correct result opens the door to potential applications: it is reasonable to provide a sign lookup functionality that offers the user 5 possible signs, in decreasing order of likelihood, and then asks the user to select the desired sign.
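The top-1 and top-5 figures above, and the five-candidate lookup idea, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the sign labels, and the scores are all hypothetical, and the sketch assumes only that a recognizer yields one score per vocabulary item for each video.

```python
from typing import Dict, List, Sequence


def top_k_candidates(scores: Dict[str, float], k: int = 5) -> List[str]:
    """Return the k highest-scoring sign labels, in decreasing order of score."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [label for label, _ in ranked[:k]]


def top_k_accuracy(
    all_scores: Sequence[Dict[str, float]],
    truths: Sequence[str],
    k: int,
) -> float:
    """Fraction of examples whose true label is among the k best candidates."""
    if len(all_scores) != len(truths):
        raise ValueError("all_scores and truths must have the same length")
    if not truths:
        raise ValueError("at least one example is required")
    hits = sum(
        truth in top_k_candidates(scores, k)
        for scores, truth in zip(all_scores, truths)
    )
    return hits / len(truths)


# Hypothetical recognizer output for two videos (labels and scores are made up).
example_scores = [
    {"BOOK": 0.6, "PAPER": 0.3, "SCHOOL": 0.1},
    {"BOOK": 0.5, "PAPER": 0.4, "SCHOOL": 0.1},
]
example_truths = ["BOOK", "PAPER"]

top1 = top_k_accuracy(example_scores, example_truths, k=1)
top2 = top_k_accuracy(example_scores, example_truths, k=2)
lookup = top_k_candidates(example_scores[1], k=2)
```

In a lookup tool of the kind described, the list returned by `top_k_candidates` would be what is shown to the user, who then picks the intended sign.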
Keywords
Computer recognition of sign language and steps towards automatic annotation
@inproceedings{metaxas:18005:sign-lang:lrec,
author = {Metaxas, Dimitris and Dilsizian, Mark and Neidle, Carol},
title = {Scalable {ASL} Sign Recognition using Model-based Machine Learning and Linguistically Annotated Corpora},
pages = {127--132},
editor = {Bono, Mayumi and Efthimiou, Eleni and Fotinea, Stavroula-Evita and Hanke, Thomas and Hochgesang, Julie A. and Kristoffersen, Jette and Mesch, Johanna and Osugi, Yutaka},
booktitle = {Proceedings of the {LREC2018} 8th Workshop on the Representation and Processing of Sign Languages: Involving the Language Community},
maintitle = {11th International Conference on Language Resources and Evaluation ({LREC} 2018)},
publisher = {{European Language Resources Association (ELRA)}},
address = {Miyazaki, Japan},
day = {12},
month = may,
year = {2018},
isbn = {979-10-95546-01-6},
language = {english},
url = {https://www.sign-lang.uni-hamburg.de/lrec/pub/18005.pdf}
}