Skeletal Graph Self-Attention: Embedding a Skeleton Inductive Bias into Sign Language Production
Saunders, Ben | Camgöz, Necati Cihan | Bowden, Richard
- Volume: Proceedings of the 7th International Workshop on Sign Language Translation and Avatar Technology: The Junction of the Visual and the Textual: Challenges and Perspectives
- Venue: Marseille, France
- Date: 24 June 2022
- Pages: 95–102
- Publisher: European Language Resources Association (ELRA)
- License: CC BY-NC 4.0
- ACL ID: 2022.sltat-1.15
- ISBN: 979-10-95546-82-5
Content Categories
- Projects: Content4All, ExTOL, SMILE
- Languages: German Sign Language, German
- Corpora: RWTH-PHOENIX Weather 2014 T
Abstract
Recent approaches to Sign Language Production (SLP) have adopted spoken language Neural Machine Translation (NMT) architectures, applied without sign-specific modifications. In addition, these works represent sign language as a sequence of skeleton pose vectors, projected to an abstract representation with no inherent skeletal structure. In this paper, we represent sign language sequences as a skeletal graph structure, with joints as nodes and both spatial and temporal connections as edges. To operate on this graphical structure, we propose Skeletal Graph Self-Attention (SGSA), a novel graphical attention layer that embeds a skeleton inductive bias into the SLP model. Retaining the skeletal feature representation throughout, we directly apply a spatio-temporal adjacency matrix into the self-attention formulation. This provides structure and context to each skeletal joint that is not possible when using a non-graphical abstract representation, enabling fluid and expressive sign language production. We evaluate our Skeletal Graph Self-Attention architecture on the challenging RWTH-PHOENIX-Weather-2014T (PHOENIX14T) dataset, achieving state-of-the-art back translation performance with an 8% and 7% improvement over competing methods for the dev and test sets.
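The abstract's key mechanism, applying a spatio-temporal adjacency matrix directly inside the self-attention formulation, can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: it assumes the adjacency acts as a hard attention mask (a soft additive bias is an equally plausible reading of the abstract), and the names `SkeletalGraphSelfAttention` and `build_spatio_temporal_adjacency`, the bone list, and the joint layout are all illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SkeletalGraphSelfAttention(nn.Module):
    """Illustrative sketch: self-attention over skeletal nodes (a joint at a time step),
    restricted by a spatio-temporal adjacency matrix."""

    def __init__(self, dim, num_heads=4):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, 3 * dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, adjacency):
        # x:         (batch, num_nodes, dim)
        # adjacency: (num_nodes, num_nodes), 1 where two nodes share a spatial or temporal edge
        b, n, d = x.shape
        qkv = self.qkv(x).reshape(b, n, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)               # each: (b, heads, n, head_dim)
        scores = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5
        # Skeletal inductive bias: a joint only attends to spatio-temporally connected joints.
        scores = scores.masked_fill(adjacency == 0, float("-inf"))
        attn = F.softmax(scores, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, d)
        return self.proj(out)


def build_spatio_temporal_adjacency(bones, num_joints, num_frames):
    """Toy adjacency over num_frames * num_joints nodes: bone edges within each frame,
    temporal edges linking each joint to itself in the next frame, plus self-loops."""
    n = num_frames * num_joints
    adj = torch.eye(n)
    for t in range(num_frames):
        off = t * num_joints
        for i, j in bones:                                  # spatial (skeletal) connections
            adj[off + i, off + j] = adj[off + j, off + i] = 1.0
        if t + 1 < num_frames:                              # temporal connections
            for j in range(num_joints):
                adj[off + j, off + num_joints + j] = 1.0
                adj[off + num_joints + j, off + j] = 1.0
    return adj


# Toy usage: 3 frames of a 4-joint chain (e.g. shoulder-elbow-wrist-hand), 32-dim joint features.
bones = [(0, 1), (1, 2), (2, 3)]
adj = build_spatio_temporal_adjacency(bones, num_joints=4, num_frames=3)
layer = SkeletalGraphSelfAttention(dim=32, num_heads=4)
out = layer(torch.randn(2, 12, 32), adj)                    # -> shape (2, 12, 32)
```

Whether the adjacency enters as a hard mask (as above) or as a soft bias added to the attention scores is a design choice settled in the paper itself; the sketch only shows where the skeletal structure plugs into the attention computation.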
Cite as
ACL Citation Format
Ben Saunders, Necati Cihan Camgöz, Richard Bowden. 2022. Skeletal Graph Self-Attention: Embedding a Skeleton Inductive Bias into Sign Language Production. In Proceedings of the 7th International Workshop on Sign Language Translation and Avatar Technology: The Junction of the Visual and the Textual: Challenges and Perspectives, pages 95–102, Marseille, France. European Language Resources Association (ELRA).
BibTeX Export
@inproceedings{saunders:70001:sltat:lrec,
  author    = {Saunders, Ben and Camg{\"o}z, Necati Cihan and Bowden, Richard},
  title     = {Skeletal Graph Self-Attention: Embedding a Skeleton Inductive Bias into Sign Language Production},
  pages     = {95--102},
  editor    = {Efthimiou, Eleni and Fotinea, Stavroula-Evita and Hanke, Thomas and McDonald, John C. and Shterionov, Dimitar and Wolfe, Rosalee},
  booktitle = {Proceedings of the 7th International Workshop on Sign Language Translation and Avatar Technology: The Junction of the Visual and the Textual: Challenges and Perspectives},
  maintitle = {13th International Conference on Language Resources and Evaluation ({LREC} 2022)},
  publisher = {{European Language Resources Association (ELRA)}},
  address   = {Marseille, France},
  day       = {24},
  month     = jun,
  year      = {2022},
  isbn      = {979-10-95546-82-5},
  language  = {english},
  url       = {http://www.lrec-conf.org/proceedings/lrec2022/workshops/sltat/pdf/2022.sltat-1.15}
}