Skeletal Graph Self-Attention: Embedding a Skeleton Inductive Bias into Sign Language Production
Saunders, Ben | Camgöz, Necati Cihan | Bowden, Richard
- Volume: Proceedings of the 7th International Workshop on Sign Language Translation and Avatar Technology: The Junction of the Visual and the Textual: Challenges and Perspectives
- Venue: Marseille, France
- Date: 24 June 2022
- Pages: 95–102
- Publisher: European Language Resources Association (ELRA)
- License: CC BY-NC 4.0
- ACL ID: 2022.sltat-1.15
- ISBN: 979-10-95546-82-5
Content Categories
- Projects: Content4All, ExTOL, SMILE
- Languages: German Sign Language, German
- Corpora: RWTH-PHOENIX Weather 2014 T
Abstract
Recent approaches to Sign Language Production (SLP) have adopted spoken language Neural Machine Translation (NMT) architectures, applied without sign-specific modifications. In addition, these works represent sign language as a sequence of skeleton pose vectors, projected to an abstract representation with no inherent skeletal structure. In this paper, we represent sign language sequences as a skeletal graph structure, with joints as nodes and both spatial and temporal connections as edges. To operate on this graphical structure, we propose Skeletal Graph Self-Attention (SGSA), a novel graphical attention layer that embeds a skeleton inductive bias into the SLP model. Retaining the skeletal feature representation throughout, we directly apply a spatio-temporal adjacency matrix into the self-attention formulation. This provides structure and context to each skeletal joint that is not possible when using a non-graphical abstract representation, enabling fluid and expressive sign language production. We evaluate our Skeletal Graph Self-Attention architecture on the challenging RWTH-PHOENIX-Weather-2014T (PHOENIX14T) dataset, achieving state-of-the-art back translation performance with an 8% and 7% improvement over competing methods for the dev and test sets.
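The abstract's key mechanism, applying a spatio-temporal adjacency matrix directly inside the self-attention formulation, can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: it assumes the adjacency acts as a hard attention mask (a soft additive bias is an equally plausible reading of the abstract), and the names `SkeletalGraphSelfAttention` and `build_spatio_temporal_adjacency`, the bone list, and the joint layout are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SkeletalGraphSelfAttention(nn.Module):
    """Illustrative sketch: self-attention over skeletal nodes (a joint at a time step),
    restricted by a spatio-temporal adjacency matrix."""

    def __init__(self, dim, num_heads=4):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, adjacency):
        # x:         (batch, num_nodes, dim)
        # adjacency: (num_nodes, num_nodes), 1 where two nodes share a spatial or temporal edge
        b, n, d = x.shape
        qkv = self.qkv(x).reshape(b, n, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)               # each: (b, heads, n, head_dim)
        scores = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5
        # Skeletal inductive bias: a joint only attends to spatio-temporally connected joints.
        scores = scores.masked_fill(adjacency == 0, float("-inf"))
        attn = F.softmax(scores, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, d)
        return self.proj(out)


def build_spatio_temporal_adjacency(bones, num_joints, num_frames):
    """Toy adjacency over num_frames * num_joints nodes: bone edges within each frame,
    temporal edges linking each joint to itself in the next frame, plus self-loops."""
    n = num_frames * num_joints
    adj = torch.eye(n)
    for t in range(num_frames):
        off = t * num_joints
        for i, j in bones:                                  # spatial (skeletal) connections
            adj[off + i, off + j] = adj[off + j, off + i] = 1.0
        if t + 1 < num_frames:                              # temporal connections
            for j in range(num_joints):
                adj[off + j, off + num_joints + j] = 1.0
                adj[off + num_joints + j, off + j] = 1.0
    return adj


# Toy usage: 3 frames of a 4-joint chain (e.g. shoulder-elbow-wrist-hand), 32-dim joint features.
bones = [(0, 1), (1, 2), (2, 3)]
adj = build_spatio_temporal_adjacency(bones, num_joints=4, num_frames=3)
layer = SkeletalGraphSelfAttention(dim=32, num_heads=4)
out = layer(torch.randn(2, 12, 32), adj)                    # -> shape (2, 12, 32)
```

Whether the adjacency enters as a hard mask (as above) or as a soft bias added to the attention scores is a design choice settled in the paper itself; the sketch only shows where the skeletal structure plugs into the attention computation.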
Cite as
ACL Citation Format
Ben Saunders, Necati Cihan Camgöz, Richard Bowden. 2022. Skeletal Graph Self-Attention: Embedding a Skeleton Inductive Bias into Sign Language Production. In Proceedings of the 7th International Workshop on Sign Language Translation and Avatar Technology: The Junction of the Visual and the Textual: Challenges and Perspectives, pages 95–102, Marseille, France. European Language Resources Association (ELRA).
BibTeX Export
@inproceedings{saunders:70001:sltat:lrec,
  author    = {Saunders, Ben and Camg{\"o}z, Necati Cihan and Bowden, Richard},
  title     = {Skeletal Graph Self-Attention: Embedding a Skeleton Inductive Bias into Sign Language Production},
  pages     = {95--102},
  editor    = {Efthimiou, Eleni and Fotinea, Stavroula-Evita and Hanke, Thomas and McDonald, John C. and Shterionov, Dimitar and Wolfe, Rosalee},
  booktitle = {Proceedings of the 7th International Workshop on Sign Language Translation and Avatar Technology: The Junction of the Visual and the Textual: Challenges and Perspectives},
  maintitle = {13th International Conference on Language Resources and Evaluation ({LREC} 2022)},
  publisher = {{European Language Resources Association (ELRA)}},
  address   = {Marseille, France},
  day       = {24},
  month     = jun,
  year      = {2022},
  isbn      = {979-10-95546-82-5},
  language  = {english},
  url       = {http://www.lrec-conf.org/proceedings/lrec2022/workshops/sltat/pdf/2022.sltat-1.15}
}