PJM Corpus
The Corpus of Polish Sign Language (PJM Corpus) is a collection of video data from 150 Deaf native signers of Polish Sign Language (PJM). It is based at the Laboratory of Sign Linguistics at the University of Warsaw and led by Paweł Rutkowski. The project started in 2009 and is still ongoing. Trevor Johnston, creator of the Auslan Corpus, joined the project as an external consultant and advisor.
The participants are recorded in pairs in a recording studio, and a deaf moderator leads the sessions. The 24 elicitation tasks were borrowed from other sign language corpora, for example the DGS Corpus. The recordings are made with HD cameras.
The Corpus Dictionary of Polish Sign Language, a free online dictionary of PJM with example sentences from the PJM Corpus, is built on the basis of the corpus.
Language | Polish Sign Language |
---|---|
Size | 565 hours recorded, 687971 tokens and 15384 types annotated |
Participants | 150 participants; Deaf; 4 age groups: 18–30, 31–45, 46–60, 60 years and older; controlled for age, gender, region, age of acquisition, social background, education |
Metadata Format | not available |
Translation | Polish, 67698 sentences translated |
Annotation | Segmentation into clause-like units, part of speech, negation, HamNoSys transcript. See Rutkowski et al. (2015), Filipczak (2014) and Kuder et al. (2022) for more information. |
Data Format | iLex, YouTrack |
Licence | CC BY-SA 4.0 |
Access | Open access to part of the recordings, with video, annotation and translation in ELAN format. Restricted access to further data is available to researchers and requires an individual license agreement. |
Webpages | Project page: https://www.plm.uw.edu.pl/projekty/korpus-pjm/ ; Dataset: https://www.korpuspjm.uw.edu.pl/en |
Institution | University of Warsaw |
Publications | https://www.plm.uw.edu.pl/publikacje/ |
Cite as
Joanna Wójcicka, Anna Kuder, Piotr Mostowski, Paweł Rutkowski (eds.), 2020, Open Repository of the Polish Sign Language Corpus, Warsaw: Faculty of Polish Studies, University of Warsaw, ISBN: 978-83-66400-21-4 (online publication: https://www.korpuspjm.uw.edu.pl).
Common tasks used in this corpus
Task | # recordings – open access | # recordings – restricted access | Data available |
---|---|---|---|
Calendar | 60 | not available | https://www.korpuspjm.uw.edu.pl/en/videos?q=[[13,14,15,16],{}] |
Charlie Chaplin | 71 | not available | https://www.korpuspjm.uw.edu.pl/en/videos?q=[[1,2,5,6],{}] |
Deaf life experiences | 0 | not available | none |
Describe process | 0 | not available | none |
Diachronic changes | 0 | not available | none |
Fire Alarm Story | 64 | not available | https://www.korpuspjm.uw.edu.pl/en/videos?q=[[1,2,3,4],{}] |
Free conversation | 0 | not available | none |
Frog Story | 64 | not available | https://www.korpuspjm.uw.edu.pl/en/videos?q=[[1,10,11,12],{}] |
Jokes | 0 | not available | none |
Lexical elicitation | 0 | not available | none |
Pear Story | 71 | not available | https://www.korpuspjm.uw.edu.pl/en/videos?q=[[1,2,5,7],{}] |
Route description | 0 | not available | none |
Sign Name | 0 | not available | none |
Signs Movie | 0 | not available | none |
Subject areas | 0 | not available | none |
Sylvester and Tweety | 64 | not available | https://www.korpuspjm.uw.edu.pl/en/videos?q=[[1,2,8,9],{}] |
Warning and prohibition signs | 71 | not available | https://www.korpuspjm.uw.edu.pl/en/videos?q=[[13,14,17,18],{}] |
What did you do when it happened | 0 | not available | none |
Your region | 0 | not available | none |
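To get a quick overview of how much of the task listing above is openly accessible, the per-task counts can be tallied. The following is a minimal sketch in Python; the dictionary name `open_access` and the variable names are illustrative choices, not part of any corpus tooling, and the numbers are copied from the listing. The entry does not say whether one recording can cover several tasks, so the total is only a sum of per-task counts.

```python
# Open-access recording counts per task, copied from the task listing above.
# The dictionary and variable names are illustrative, not part of the corpus tools.
open_access = {
    "Calendar": 60,
    "Charlie Chaplin": 71,
    "Deaf life experiences": 0,
    "Describe process": 0,
    "Diachronic changes": 0,
    "Fire Alarm Story": 64,
    "Free conversation": 0,
    "Frog Story": 64,
    "Jokes": 0,
    "Lexical elicitation": 0,
    "Pear Story": 71,
    "Route description": 0,
    "Sign Name": 0,
    "Signs Movie": 0,
    "Subject areas": 0,
    "Sylvester and Tweety": 64,
    "Warning and prohibition signs": 71,
    "What did you do when it happened": 0,
    "Your region": 0,
}

# Keep only the tasks that have at least one open-access recording.
available = {task: n for task, n in open_access.items() if n > 0}

# Sum of per-task counts (not necessarily the number of distinct video files).
total = sum(available.values())

print(f"{len(available)} of {len(open_access)} tasks have open-access recordings")
print(f"Sum of per-task open-access counts: {total}")
for task, n in sorted(available.items(), key=lambda item: -item[1]):
    print(f"{n:>3}  {task}")
```

The remaining tasks are listed with 0 open-access recordings; the number of restricted-access recordings is not available for any task.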
Articles mentioned above
- Anna Kuder, Joanna Wójcicka, Piotr Mostowski, Paweł Rutkowski (2022). "Open Repository of the Polish Sign Language Corpus: Publication Project of the Polish Sign Language Corpus". In: Proceedings of the LREC2022 10th Workshop on the Representation and Processing of Sign Languages: Multilingual Sign Language Resources (Marseille, France). Ed. by Eleni Efthimiou, Stavroula-Evita Fotinea, Thomas Hanke, Julie A. Hochgesang, Jette Kristoffersen, Johanna Mesch, Marc Schulder. Paris, France: European Language Resources Association (ELRA), pp. 118-123. ISBN: 979-10-95546-86-3. ACL: 2022.signlang-1.18.
- Joanna Filipczak (2014). "Anotacja korpusu PJM [The PJM Corpus Annotation Process]". In: Lingwistyka przestrzeni i ruchu. Komunikacja migowa a metody korpusowe [Linguistics of space and movement. Sign language communication and corpus methods]. Ed. by Paweł Rutkowski, Sylwia Łozińska. Warsaw: Faculty of Polish Studies, University of Warsaw, pp. 91-105.
- Paweł Rutkowski, Joanna Filipczak, Anna Kuder (2015). "PJM Corpus Annotation Guidelines".
This entry was last modified on 30 January 2023.