Leena Savolainen, Research Institute for the Languages of Finland, Helsinki

The database system used in the FinSL Dictionary Project

1. Background

Between 1988 and 1998, the Finnish Association of the Deaf ran a dictionary project. The project was supported by the Research Institute for the Languages of Finland, one of whose researchers had worked on it since 1989. In the course of the project a Finnish Sign Language - Finnish dictionary was produced. The book was published in 1998 and contains 1219 sign entries.

The first stage of the dictionary work was to produce the main part of the Finnish Sign Language (FinSL) material. We made a list of glossed signs that was tested and approved by a larger working group. Then two of our Deaf employees demonstrated a sequence of sentences with each sign used in different contexts. The citation form of each sign was videotaped along with the contextualized examples. This material totals 38 hours and it has been the basis of our work all through the project.

Once the entries were videotaped we started preparing the Finnish language part: searching for the Finnish equivalents for the signs, translating the example sentences and writing the grammatical and other notes on the signs. The still pictures showing the citation forms of the signs were captured from a videotape and further processed on a computer.

During the first three years we used an ordinary text editor for storing the texts. In 1991 we moved over to using a database. For a variety of practical reasons, we chose the database facility offered by the WordPerfect program (the DOS version). What we wanted from the database was that it would be able to sort the material according to any field, and that the program would be flexible, i.e. that we could change the number of fields and their contents ourselves.

Our database contains only textual material as it is technically impossible to include pictures or videoclips in it. However, we have used numbers and other codes, which function as links to the signed entries on the videotapes and to the still pictures and graphics on a computer hard disk. In the time that the project was running, database programs and computer technology in general developed considerably, but we did not consider it worthwhile transferring mid-project to a totally different kind of database. Besides, the video material still causes big technical problems; it is simply not practicable to store 38 hours of video material on computer hard disks.

2. Description of the database

Structure

One record in our database contains all the textual material connected to one sign entry in the dictionary. The structure of our database is simple (flat) in that there are no direct links from one record to another within the database. Instead, there are lots of textual cross-references between the records, and the names of the picture and graphics files used in the book's layout are listed in their own fields.

We divided the database into sections of 100 records each. With this kind of database, doing so reduced the risk of accidentally destroying part of the material. More cumbersome searches within the database were considered a small price to pay for greater security of the system overall.
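The trade-off described above can be illustrated with a minimal sketch. It is not the WordPerfect mechanism we actually used; the record layout (a dictionary of field names to text) and the function names are assumptions made only for illustration.

```python
from typing import Dict, Iterator, List

# Hypothetical record layout: field name -> text content.
Record = Dict[str, str]

SECTION_SIZE = 100  # the section size stated in the text


def split_into_sections(records: List[Record], size: int = SECTION_SIZE) -> List[List[Record]]:
    """Split a flat list of records into consecutive sections of at most `size` records."""
    return [records[i:i + size] for i in range(0, len(records), size)]


def search_sections(sections: List[List[Record]], field: str, needle: str) -> Iterator[Record]:
    """Search every section in turn; this loop is the extra cost of splitting."""
    for section in sections:
        for record in section:
            if needle in record.get(field, ""):
                yield record


# Illustrative data only: 1230 dummy records, one of which mentions "nukkua".
records = [
    {"entry": str(n), "equivalents": "nukkua" if n == 1028 else ""}
    for n in range(1, 1231)
]
sections = split_into_sections(records)
hits = list(search_sections(sections, "equivalents", "nukkua"))
```

Damage to one section leaves the other sections untouched, but every search has to visit each section separately.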

Field types and types of information registered

As our database was made not for research purposes but for gathering together the material for the dictionary in a handy format, the results of our linguistic and lexicographic study were divided into only as many fields as the structure of our dictionary required. As a consequence, several different features of the sign or of Finnish words are often defined in one field. In particular, the fields for "grammatical information and other notes on the sign", "the Finnish equivalents of the sign" and "miscellaneous notes" contain information that in a research database would be split across separate fields.

Each record in our database contains up to 18 fields. The fields can be divided into the following eight field types:

  1. the number of a dictionary entry
  2. the names of the picture files showing the citation form of the sign
  3. miscellaneous notes
  4. codes for search parameters
  5. grammatical and other notes on the sign
  6. Finnish equivalents of the sign
  7. a Finnish translation of a signed example
  8. a signed example sentence notated with Finnish glosses

The fields contain the following types of information:

The database contains 1230 records.

Transcription methods used

In our database we have exploited the transcription methods for sign languages to a limited extent only. At the beginning of the project we notated the citation forms of each of the signs and their variants with HamNoSys, but this proved to be too time-consuming.

The mouth patterns that are of FinSL origin (i.e. not the Finnish mouth patterns) are notated using our own notation system, which is based on the International Phonetic Alphabet (IPA). We have also occasionally used a Stokoe-based notation written in a normal text font as a means of referring to a certain handshape.

Compatibility with other databases

Our database is technically simple in that most of the data is in text format, i.e. there are no ready-made option boxes to click on. It is therefore possible to write a simple program which exports our data, and possibly also converts it, to another kind of database. An example of this procedure is what the compositor of our dictionary did: he moved the material from our database to FrameMaker files by using the programming language offered by the FrameMaker program.
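As a rough sketch of what such an export program might look like, the following writes text-only records to CSV, a format most database programs can import. This is not the FrameMaker conversion mentioned above. The field names and the choice of CSV are assumptions made for illustration, and the sample values are shortened from the example record given later in this paper.

```python
import csv
import io
from typing import Dict, List

# Hypothetical field names; the real database identifies fields by number.
FIELDS = ["entry_number", "picture_files", "search_codes", "equivalents"]


def export_csv(records: List[Dict[str, str]], fields: List[str] = FIELDS) -> str:
    """Serialize text-only records to CSV, one row per record."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for record in records:
        # A field missing from a record is exported as an empty string.
        writer.writerow({name: record.get(name, "") for name in fields})
    return buffer.getvalue()


# Sample values shortened from the example record.
sample = {
    "entry_number": "1028",
    "picture_files": "1028.1 1028.2",
    "search_codes": "151 1 201 355",
    "equivalents": "nukkua, olla unessa",
}
csv_text = export_csv([sample])
```

Because every field is plain text, the export needs no special handling beyond quoting, which the `csv` module does automatically for values containing commas.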

However, two problems arise when trying to export the data from our database into somebody else's sign language database: part of the information is in Finnish, and some of our fields contain several different kinds of information which cannot be separated from each other automatically.

Video and still-pictures not integrated

Still pictures or video clips cannot be included in the type of database we have been using. We did, however, create separate fields for all the names of the picture and graphics files used in the book's layout. In addition, the number codes of the signed entries on the videotapes can be found in their own fields.
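The way such text fields can act as links to material outside the database might be sketched as follows. The sample values ("1028.1 1028.2", "T-680", "R-811") are taken from the example record given later in this paper. The function names are hypothetical, and the assumption that every video code has the form letter, hyphen, number is inferred from that single example.

```python
import re
from typing import List, NamedTuple


class VideoCode(NamedTuple):
    prefix: str   # e.g. "R" (raw version) or "T", as in the example record
    number: int   # number of the signed entry on the videotapes


def picture_files(field: str) -> List[str]:
    """Split a picture field into the individual picture file names."""
    return field.split()


def parse_video_code(code: str) -> VideoCode:
    """Parse a code such as 'T-680'; the letter-hyphen-number form is an assumption."""
    match = re.fullmatch(r"([A-Z])-(\d+)", code.strip())
    if match is None:
        raise ValueError(f"unrecognized video code: {code!r}")
    return VideoCode(match.group(1), int(match.group(2)))


files = picture_files("1028.1 1028.2")
revised = parse_video_code("T-680")
raw = parse_video_code("R-811")
```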

The still pictures showing the citation forms of the signs were captured from a videotape and further processed on a computer. The graphics (of handshapes, movements etc.) needed for the book's search system were drawn on paper, scanned and vectorized. The video capturing, image processing and the layout of the book were all carried out on a Macintosh.

On the videotapes we have a) the citation form of the sign and on average four signed example sentences on its use, and b) the citation forms re-videoed from different angles so that clear pictures could be captured for the book. The first shooting of the entries was done at our office on SuperVHS tape and the second in a studio on Betacam tapes. To capture the images we used SuperVHS tapes.

An example record from our database.

First each field is explained and then its content is given. The meaning of the sign in question is 'to sleep, to be asleep; to fall asleep'.

Field 1
  Description: The number of the entry after arranging the entries according to the principles of the search system in the book.
  Contents: 1028

Field 2
  Description: The number of the revised entry (T = the entry has not been re-videotaped in its final form).
  Contents: T-680

Field 3
  Description: The number of the first version of the entry (R = raw version).
  Contents: R-811

Field 4
  Description: The links to the pictures showing the citation form of the sign. The numbers are the names of the picture files.
  Contents: 1028.1 1028.2

Field 5
  Description: Miscellaneous notes on the sign and on the entry as a whole.
  Contents: VALMIS am, ps/pla
  SIIRRETTY[nuu]; kun viittoman liike on a) voimakkaampi, b) hitaampi ja sitkoinen tai c) nopeampi ja lopussa edestakainen liike sivulta sivulle, ja kun huulio on esim. [hyy], [yy] tai [B] = nukkuu sikeästi => ei laiteta vielä tähän kirjaan, kaipaa lisätutkimusta; POISTO: Pienessä huoneessa <u>nukkui<u/> parikymmentä henkeä sikin sokin.

Field 6
  Description: The codes for the search parameters (handshape, one-/two-handed sign, place of articulation, and movement). The graphics files of the search parameters are named respectively with these numbers.
  Contents: 151 1 201 355

Field 7
  Description: Grammatical and other notes on the sign
  Contents: myös kaksikätisenä

Field 8
  Description: The Finnish equivalents of the sign. The material between the £ codes explains how the sign should be modified so that it would refer to the meanings listed in the second group of Finnish equivalents.
  Contents: 1. nukkua, olla unessa; nukahtaa, vaipua uneen; @(ylätyylissä:)@ uinua; @(arkikielessä:)@ vetää sikeitä, koisata, koisia
  2. nukkua sikeästi, nukkua syvää unta, nukkua kuin tukki £liike on voimakkaampi, ja huulio voi olla [B]£

Field 9
  Description: The Finnish translation of the first signed example sentence. The part in between the codes <u> and <u/> is printed in bold and italics in the book.
  Contents: Päiväsaikaan kissa <u>nukkuu<u/> hyvin.

Field 10
  Description: Field for Finnish glosses and comments on the signed sentence. The glosses are used as a memory aid when the sentence is videotaped in its final form. This field is usually filled only, if the example is not yet re-videotaped or not videotaped at all. (The Finnish translation of this sentence can be found in field 9.)
  Contents: PÄIVÄLLÄ KISSA NUKKUA-SIKEÄSTI RRRR

Field 11
  Description: The Finnish translation of the second signed example sentence
  Contents: Viime aikoina olen <u>nukkunut<u/> hyvin vähän.

Field 12
  Description: Field for Finnish glosses and comments on the signed sentence.
  Contents: (This field is left empty.)

Field 13
  Description: The Finnish translation of the third signed example sentence
  Contents: Kahdesta kaveruksesta toinen <u>nukkui<u/> kuin tukki, toinen taas heräili vähän väliä.

Field 14
  Description: Field for Finnish glosses and comments on the signed sentence. The Finnish text simply indicates that the beginning of the signed sentence has been cut out. (The Finnish translation of the remainder of the sentence can be found in field 13.)
  Contents: Huom. lausetta on lyhennetty alusta, pois: INTERRAILMATKA EUROOPPA KIERTÄÄ
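The codes in fields 6 and 9 of this record lend themselves to simple automatic processing, which the following sketch illustrates. The parameter names and function names are hypothetical. It is also an assumption that the four numbers in field 6 appear in the same order as the parameters listed in the field description.

```python
import re
from typing import Dict, List

# Assumed order, following the description of field 6.
PARAMETERS = ["handshape", "handedness", "place_of_articulation", "movement"]

# The <u> ... <u/> codes mark the part printed in bold and italics in the book.
HIGHLIGHT = re.compile(r"<u>(.*?)<u/>")


def parse_search_codes(field: str) -> Dict[str, str]:
    """Map the four space-separated codes of field 6 to named search parameters."""
    codes = field.split()
    if len(codes) != len(PARAMETERS):
        raise ValueError(f"expected {len(PARAMETERS)} codes, got {len(codes)}")
    return dict(zip(PARAMETERS, codes))


def highlighted_words(sentence: str) -> List[str]:
    """Return the parts of a translation enclosed in the highlight codes."""
    return HIGHLIGHT.findall(sentence)


def strip_highlight_codes(sentence: str) -> str:
    """Remove the highlight codes, leaving plain text."""
    return HIGHLIGHT.sub(r"\1", sentence)


codes = parse_search_codes("151 1 201 355")
words = highlighted_words("Päiväsaikaan kissa <u>nukkuu<u/> hyvin.")
plain = strip_highlight_codes("Päiväsaikaan kissa <u>nukkuu<u/> hyvin.")
```

Fields such as 5 and 8, which mix several kinds of information in free text, cannot be taken apart as reliably as this.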

The record of the example given above as it appears in the book:

3. Future work

During the dictionary project we carefully described the citation form of each sign on paper by using different methods of notation: HamNoSys, Stokoe-based notation, explanations in Finnish and drawings. The final decisions on the citation forms were recorded on videotape, and the still pictures of the citation forms for the book were also made in strict accordance with these decisions. In addition to the videotapes and the still pictures for the book, we still have a large amount of analyzed data which exists only on paper, and it is our aim to build up a research database and make use of this information. This work will probably get under way in 1999.

In 1999, the dictionary and research team at the Finnish Association of the Deaf will be working on four different projects on FinSL: a CD-ROM on polysynthetic signs (in co-operation with the Research Institute for the Languages of Finland), numerals and their derivatives to be published in the form of a booklet and a videotape, a biology lexicon for the schools for the Deaf, and a multimedia dictionary. We plan to use FileMaker Pro as the database program in each of these projects.

 


I want to thank Anja Malm for her comments on my article and John Calton for revising the text.
