Appendix to the Guidelines: Getting Digital

Introduction

This appendix tries to provide information for people who start using digital video. Information in this appendix is not based on actual research carried out in the Signing Books project but on our own experience with new technologies. We try to address the issues most frequently raised by the users of the Signing Books helpdesk and will provide relevant references.

You might find the information in this appendix useful if you are in one of the following situations:

You have no video equipment, but need to decide how to start.
You have analogue equipment, but you want to produce a CD-ROM or DVD.
You have analogue equipment, and you want to replace some old stuff with new technology.

Many of the persons who contacted the Signing Books helpdesk during the past two years asked us to provide as detailed information as possible, e.g. to give order numbers for specific products. While we were happy to tell them about our experiences with certain equipment, we did not actually recommend specific solutions for a variety of reasons:

Even this special market has become so large that only a few people have a good overview.
Most product-specific information would be out of date at the time it is written: The time on market for most products in the computer industry has fallen to below one year. This is even truer for digital video technology.
The quality of a product depends heavily on the quality of support available in your region. You may be better off choosing a product for which you can find regional support and/or an experienced dealer than purchasing cutting-edge technology as an early adopter.

The same holds true for this appendix: We will only list specific products if they are representative examples or if they have become de-facto standards.

This text in no way tries to replace books out there in the market. Instead it tries to be a short introduction into the subject with special attention to the target group, i.e. producers of SigningBooks. We suggest that you have a look at the relevant literature on specific topics if you need more information. Keep in mind that there are others who in the past may have faced problems similar to the ones you are currently confronted with and who can share their experiences. One of the most important outcomes of the SigningBooks Symposium in Hamburg in November 1999 was that participants were very open-minded about cooperation. Other players in the SigningBooks domain are rarely seen as competitors, but rather as companions on the way to better products. So make use of this opportunity and contact other people in the field from all over Europe!

The following three chapters describe the technological basics while the final chapter tries to summarise the information in the light of the decision you have to take.

How do the moving pictures get into my computer?

When video technology was born with the definition of television broadcast, transmission technology was based on analogue encoding: a television signal was transmitted using modulation of a radio signal; the higher the signal to be encoded, the "higher" the modulation. This was true for the whole chain from generating the TV picture to the TV set in consumers' households, and it was kept the same when adding pieces like VCRs to the scene.

The digital revolution, with the computer creeping into every machine, only changed part of the story: While some types of equipment became digital inside, video transfer remained analogue until only a few years ago (and in most cases still is: Digital TV is still in its infancy). For example: in cameras, the electron valves were replaced by sensor (CCD) chips whose output is processed digitally. But then, inside the camera, the digital signal was converted to analogue to be transferred to the next piece of equipment, which might need to convert it back to digital format. This means that a lot of unnecessary conversion takes place before the video images reach the viewer, resulting in a loss of quality. About ten years ago, the first high-end recorders appeared with digital in and out. Unfortunately, vendors did not agree on a common standard, and the technology was meant for the high-end market of online and offline editing systems and therefore did not have a major impact on the technology in general. This may now change because IEEE1394 (aka FireWire) is becoming more and more popular: there are now "DV" cameras that work digitally from the chips to output, including storage, in a price range close to consumer level (around 1500 EUR in 12/99). There are also computers that have FireWire capabilities built in or that can be equipped with appropriate extra cards for this purpose (starting at about 1400 EUR in 12/99). Connecting these two by a simple cable, you have a relatively low-cost system with a transfer quality which three years ago was only achievable at prices at least fifty times higher than today!

From digital equipment

When using FireWire technology, all you need is a cable to plug your camera into your computer and a program to store an incoming video stream on your hard disk. (You might find such a program bundled with a FireWire-equipped computer.) With a data rate of 25 MBit/s (or even 50 MBit/s for DV50), it is quite easy to fill up your hard drives. So keep in mind: You need huge hard disks if you want to work with DV input. You will also see that your computer has a hard time playing back DV files, as DV decoding is computationally intensive. If you plan to make the video footage available on CD-ROM or DVD, you will have to transform it to another format (cf. the heading "Compressing" in the next chapter).
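To get a feeling for the disk space involved, the following sketch converts the data rates quoted above into storage per minute and per hour. It counts only the video stream of 25 or 50 MBit/s and takes 1 MByte as 10^6 bytes; audio and any file overhead are ignored, and the function name is merely illustrative.

```python
def storage_mbyte(bit_rate_mbit_per_s: float, seconds: float) -> float:
    """Return the storage needed in MByte (1 MByte = 10**6 bytes)."""
    # 8 bits per byte: MBit/s divided by 8 gives MByte/s.
    return bit_rate_mbit_per_s / 8 * seconds


dv_per_minute = storage_mbyte(25, 60)       # DV video stream, one minute
dv_per_hour = storage_mbyte(25, 3600)       # DV video stream, one hour
dv50_per_hour = storage_mbyte(50, 3600)     # DV50 video stream, one hour

print(f"DV:   {dv_per_minute:.1f} MByte per minute, "
      f"{dv_per_hour / 1000:.2f} GByte per hour")
print(f"DV50: {dv50_per_hour / 1000:.2f} GByte per hour")
```

Under these assumptions, one hour of DV footage already needs more than 11 GByte, before any working copies are made during editing.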

If you want to edit your DV footage and then transfer it back to DV cassettes, leave it as it is.

From analogue equipment

To transfer analogue video to your computer you need a so-called digitizer board. (Actually, a digitizer may be built into your computer, or it may be a separate box connected via USB or SCSI.) The digitizer converts analogue video, which comes in via a video cable, to digital format.

The quality of the video in your computer depends on three factors:

The quality of the source material needs to be as good as possible. The higher the tape quality, the better your results will be, i.e. S-VHS is preferred over VHS, and Component Signal is preferred over S-VHS.
The digitizer board plays an important role. You should not necessarily buy the cheapest board you can find. Also keep in mind that different video systems require different input jacks for the digitizer. If you have a high-end BetacamSP recorder and your card can only input S-VHS, you will waste quality.
The compression you choose for the material to go through after digitisation and before the material is stored on the hard drive is very important. An uncompressed full-format video stream would result in a data rate of 250 MBit/s. Even if your computer were fast enough to handle that much data, your hard drives would be full in seconds. Therefore, most digitizer boards have compression hardware on board, with M-JPEG being the most popular format.
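The figure of roughly 250 MBit/s can be reproduced with a short calculation. The sketch below assumes a frame of 720 x 576 pixels with 24 bits per pixel at 25 frames per second; these values are our assumption of what "full format" means for PAL and are not stated above.

```python
WIDTH, HEIGHT = 720, 576    # assumed full-format PAL frame
BITS_PER_PIXEL = 24         # assumed: 8 bits for each of three colour components
FRAMES_PER_SECOND = 25      # PAL frame rate

# Uncompressed data rate in bits per second.
bits_per_second = WIDTH * HEIGHT * BITS_PER_PIXEL * FRAMES_PER_SECOND

print(f"{bits_per_second / 1_000_000:.1f} MBit/s")
```

With other assumptions (for example a smaller frame or fewer bits per pixel) the result changes accordingly, but the order of magnitude stays the same.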

Working with large amounts of video clips

If you plan to work with large numbers of videos to be digitised, it is very convenient if you can enter an edit list (time codes for the beginning and end of each clip) and then ask the computer to process that list. The computer then tells the VCR to jump to the required position on tape, and then records the incoming video until the end is reached. With FireWire, you only need software to handle this process as the required connection is built into the FireWire protocol. With other technologies, you need not only the appropriate software, but also hardware to command your VCR. In many cases, a simple RS-422 cable is all you need, but there are a variety of implementations on the market, and you have to check what fits the VCR you have.
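What such an edit list might look like inside a program can be sketched as follows. The time code notation HH:MM:SS:FF, the clip names and the convention that the end time code is the first frame not captured are assumptions made for this illustration; real capture software defines its own format.

```python
FPS = 25  # PAL frame rate


def timecode_to_frames(timecode: str, fps: int = FPS) -> int:
    """Convert 'HH:MM:SS:FF' into a frame count from the start of the tape."""
    hours, minutes, seconds, frames = (int(part) for part in timecode.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


# Hypothetical edit list: (clip name, time code in, time code out).
edit_list = [
    ("intro", "00:00:10:00", "00:00:25:12"),
    ("chapter1", "00:01:02:05", "00:03:00:00"),
]


def clip_lengths(clips, fps: int = FPS):
    """Return (name, first frame, number of frames) for each clip."""
    result = []
    for name, tc_in, tc_out in clips:
        start = timecode_to_frames(tc_in, fps)
        end = timecode_to_frames(tc_out, fps)  # treated as exclusive end
        if end <= start:
            raise ValueError(f"clip {name!r} ends before it starts")
        result.append((name, start, end - start))
    return result


for name, start, length in clip_lengths(edit_list):
    print(f"{name}: start at frame {start}, capture {length} frames "
          f"({length / FPS:.2f} s)")
```

A capture program would walk through such a list, position the tape at each start frame and record the given number of frames.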

How can I manipulate digital video?

When you have finally managed to get your video into your computer, what do you do with it except playing it back to verify that all went well? Well, you may not even be able to play back your movie before compressing it. Anyway, you probably do not want to keep the movie as it is, but apply changes.

Editing

Editing means cutting out pieces, rearranging stuff, adding stuff you already have lying around somewhere, adding titles, effects, etc. For this purpose, you need editing software that allows you to manipulate the digital video files as you like. As usual, you will find simple programs basically for free and highly sophisticated programs for a lot of money.

Cropping and Aspect Ratio Correction

Unless you want to output the material to where it came from, you will probably need to cut off parts of the original to make it fit the width-to-height ratio of the target system, e.g. when going onto DVD, you can choose between ratios of 4:3 and 16:9. If you do not cut (crop) your movie, it will probably appear distorted. But even when going for a CD-ROM or some other computer-based target, you might want to cut off some rows of the input video signal as the bottom-most lines of most video signals contain nothing but noise. DV poses an additional problem: When going from DV to computer-based display, you need to compensate for the non-square pixels that DV uses. While DV images look normal on a television screen, they do not on computer screens that can only display square pixels: there the picture appears horizontally distorted (too narrow for PAL material, too wide for NTSC material). Special software is needed to compensate for this, resulting in a slight blur in the target image.
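The correction for non-square pixels boils down to a simple calculation. The sketch below assumes a PAL DV frame of 720 x 576 stored pixels (a value not given above) whose full width is meant to fill the display; it keeps the number of lines and computes the width needed on a square-pixel screen. The function name is illustrative only.

```python
from fractions import Fraction


def square_pixel_size(width: int, height: int, display_aspect: Fraction):
    """Return (width, height) in square pixels, keeping the number of lines.

    The stored width is not used: only the line count and the intended
    display aspect ratio determine the square-pixel width.
    """
    target_width = round(height * display_aspect)
    return target_width, height


pal_dv = (720, 576)  # assumed stored frame size of PAL DV

size_4_3 = square_pixel_size(*pal_dv, Fraction(4, 3))     # 4:3 material
size_16_9 = square_pixel_size(*pal_dv, Fraction(16, 9))   # 16:9 material

print(size_4_3)
print(size_16_9)
```

Stretching 720 stored pixels per line to the computed width requires interpolation between neighbouring pixels, which is where the slight blur comes from.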

Compressing

Regardless of whether the video comes from digital or analogue sources, it is kept in files on your computer. In most cases, the file format is a wrapper around the actual data, which is the compressed form of a sequence of images (frames). If you have PAL or SECAM input, you should end up with 25 frames per second, each of them containing approx. 1 MByte of uncompressed data. This demonstrates how important compression is: One single minute of uncompressed video frames would take up 1.5 GByte on your hard disk. Compression can bring down these data rates by a factor of 40, ending at less than 40 MByte per minute.
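The arithmetic behind these numbers is easy to check. The short sketch below uses only the approximate figures from the paragraph above (1 MByte per frame, 25 frames per second, compression factor 40); the variable names are our own.

```python
FRAME_MBYTE = 1.0     # approximate size of one uncompressed frame
FPS = 25              # PAL or SECAM
COMPRESSION = 40      # compression factor used in the text

# Storage for one minute of video, in MByte.
uncompressed_per_minute = FRAME_MBYTE * FPS * 60
compressed_per_minute = uncompressed_per_minute / COMPRESSION

print(f"uncompressed: {uncompressed_per_minute:.0f} MByte per minute")
print(f"compressed:   {compressed_per_minute:.1f} MByte per minute")
```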

Almost all compressors used in video technology are "lossy", i.e. they modify the picture in a way that makes compression easier, which results in a loss in quality. In general, the higher the compression factor, the more quality you lose. However, compressors differ in what aspects of the movie they pay attention to. Some compressors reduce the number of colours in the image, some introduce pixel noise, some take forever to do their job. Choosing the right compressor for a certain task is a science in itself. The best advice is: Try it out yourself with your own material!

A data rate achieved by compression might look a lot better than without compression, but is still far too much for many applications. (Consider that a CD-ROM could only contain 15 minutes of video with a data rate of 40 MByte per minute, taking for granted that data can be read off the CD fast enough.) To further reduce the data rate, you can choose to reduce the size of the image, or the number of frames per second during the compression process. (That is why in the beginnings of digital video one always found these stamp-size movies.) E.g. by reducing height and width of the image to one half each and going for 12.5 frames per second instead of 25, you typically reduce the data rate by a factor of 8. As with the choice of the compressor, there is a trade-off you have to decide about in view of your own application context.
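The trade-off can be put into numbers with a small sketch. It assumes a CD-ROM capacity of 650 MByte (not stated above) and that the data rate shrinks in proportion to the number of pixels and frames, which is only a rule of thumb for real compressors. All names are illustrative.

```python
CD_CAPACITY_MBYTE = 650   # assumed capacity of a CD-ROM


def minutes_on_disc(mbyte_per_minute: float,
                    capacity: float = CD_CAPACITY_MBYTE) -> float:
    """Return how many minutes of video fit on the disc."""
    return capacity / mbyte_per_minute


def reduction_factor(width_scale: float, height_scale: float,
                     fps_scale: float) -> float:
    """Factor by which the data rate drops when size and frame rate are scaled."""
    return 1 / (width_scale * height_scale * fps_scale)


# Half the width, half the height, 12.5 instead of 25 frames per second.
factor = reduction_factor(0.5, 0.5, 12.5 / 25)

full_size_minutes = minutes_on_disc(40)
reduced_minutes = minutes_on_disc(40 / factor)

print(f"reduction factor: {factor:.0f}")
print(f"full size:    {full_size_minutes:.2f} minutes per CD-ROM")
print(f"reduced size: {reduced_minutes:.0f} minutes per CD-ROM")
```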

At http://www.codeccentral.com, maintained by Terran, Inc., you find a lot of information on when to choose which compressor. The web pages show different demos that can help you to decide which compressor works best with your type of video. However, if you are planning to output your digital video to video cassettes, be it in digital or analogue format, your main concern is not to make the video as small as possible, but to keep the quality loss to a minimum in order to achieve an output on tape that is not recognisably different from what you had captured into your computer earlier on. Therefore, you would choose a compression that allows you to fit all the material you need on your hard disks and maybe to have good playback quality, but not going beyond that.

If you are going to produce a DVD, you have the choice between MPEG-1 and MPEG-2, where MPEG-2 will be the preferred option in most cases. Also, a lot of the compression details will be dictated by the DVD standard.

While some high-end video editing software packages have proprietary formats or rely on standard file formats (such as OMFI, cf. http://www.avid.com/3rdparty/omfi/) for storing compressed video, the majority of programs use the file formats provided by one of the major video architectures, such as QuickTime, RealSystem G2, or DirectShow and Windows Media Technologies (as successors of AVI, and Video for Windows).

Since not all compressors are available for all video architectures, your choice of a compressor may already decide which architecture to use. Or the other way round, if you want to work within a specific architecture (maybe due to its availability to your customers or its compatibility with your editing software), your choice of compressors may already be narrowed down.

Adding navigation capabilities and interactivity

One of the major disadvantages of classical video is its lack of navigation possibilities for the user. In the best case, a videocassette is accompanied by a printed table of contents referring to certain points of time on the videotape by means of time codes. Unfortunately, home video equipment is not very precise when it comes to working with time codes.

Often the user has to rewind or forward the tape until the counter is quite close to the time code given in the table of contents. A number of additional features found in some productions can help you to find out where you are when viewing or searching with picture: Coloured backgrounds matching chapters, the insertion of chapter numbers or chapter headings, etc. Going digital gives you a lot more freedom; nevertheless, video remains a primarily linear medium.

What exactly you have available when using digital media depends on the medium.

On DVD-Video, tables of contents on the medium itself are standard. The user knows which button to press to view the table of contents. The creator of a DVD has some freedom in designing the table of contents. You are not bound to text headings, but instead can use graphics or even moving images to represent each part of the video. Even with the extended possibilities of inserting tables of contents, extra movies, etc., DVD-Video, measured by its usage concept, is only a small step beyond VCRs: DVD-Video productions are centred around one video stream visible to the user. The medium does not easily lend itself to simultaneous information presentation (such as a signer overlay) or to cross-links in the video.

On CD-ROM or DVD-ROM, you have the largest degree of freedom. CD-ROM/DVD-ROM productions often have tables of contents; in addition, there are more design capabilities than for DVD-Video. A few CD-ROM productions go beyond that by adding information to the video timing that is displayed concurrently with the video or that allows the user to access external information or to jump to different parts of the video.

Technically, this can be achieved by different architectures:

In QuickTime, the movie author can augment the video by turning it into a "wired movie". Certain areas of the video at certain time intervals can be linked with special actions that are executed when the viewer clicks on the right place at the right point of time. Or the movie can direct a web browser to show certain URLs at certain points of time, possibly resulting in a movie with background information available from self-created or referenced World Wide Web pages.

In SMIL, an XML application (kind of successor to HTML), several time-based services can be synchronised. An example would be two video streams where one starts when the other one pauses. Thus, you could create a World Wide Web page where two signers take turns in telling a story, without wasting bandwidth by making one large video which shows both signers. Displaying static contents from certain URLs connected to points of time in the video is, of course, also possible. In this architecture, the viewing environment, e.g. the WWW browser, plays the central role, because it starts and stops videos as defined by the SMIL document; the videos themselves remain largely unchanged.
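To give an impression of the SMIL approach, here is a minimal sketch that generates a SMIL document for two signers taking turns. It uses elements from SMIL 1.0 (smil, head, layout, root-layout, region, body, seq, video); the clip file names, region names and pixel sizes are invented for this example, and a real production would need more attributes than shown here.

```python
import xml.etree.ElementTree as ET


def build_smil(turns):
    """Build a SMIL document that plays the given clips one after another.

    turns: list of (region id, clip file name) in playing order.
    """
    smil = ET.Element("smil")
    layout = ET.SubElement(ET.SubElement(smil, "head"), "layout")
    ET.SubElement(layout, "root-layout", width="320", height="240")

    # One screen region per signer, placed side by side.
    region_ids = []
    for region_id, _ in turns:
        if region_id not in region_ids:
            region_ids.append(region_id)
    for index, region_id in enumerate(region_ids):
        ET.SubElement(layout, "region", id=region_id,
                      left=str(160 * index), top="0",
                      width="160", height="240")

    # <seq> plays its children in sequence: each clip starts when the
    # previous one has ended.
    seq = ET.SubElement(ET.SubElement(smil, "body"), "seq")
    for region_id, clip in turns:
        ET.SubElement(seq, "video", src=clip, region=region_id)
    return ET.tostring(smil, encoding="unicode")


# Hypothetical clips: signer A and signer B alternate.
document = build_smil([
    ("signer_a", "a_part1.mov"),
    ("signer_b", "b_part1.mov"),
    ("signer_a", "a_part2.mov"),
])
print(document)
```

Each signer is stored in separate small clips, so no bandwidth is spent on a picture of the signer who is currently waiting.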

There are different predecessors to this technique of using markers in the video, e.g. by inserting beeps in an inactive audio track.

The technology of wired movies, independent of how they are implemented, allows for interactivity: It is not only the script writer who plans the course of actions, but the user can intervene by selecting from different options, or by answering some kind of multiple-choice questions. To a certain degree, this is also possible with DVD-Video technology. However, as this technology was not tailored to this type of use, most authoring environments are not either. A lot of effort is therefore necessary to accomplish something that is comparatively easy in computer-driven environments.

How do I output digital video?

To analogue video

Until only a couple of years ago, digital movie files were never output to analogue video since the resolution of the files was too low. Instead, working in the computer with a low-quality digital copy produced so-called Edit Decision Lists (cutting instructions) that were handed to a cutting system that mixed pieces from the various sources by controlling one or more VCRs with the original material to copy the right scene at the right point of time to another VCR recording whatever the cutting system mixes in. High-end non-linear systems such as the AVID system still can use this approach, only supplying digital footage if that had not originated from VCR sources or had been modified in the computer. However, most systems can now "print to video", i.e. play the video to a digital-to-analogue converter that can be plugged as source into a VCR. For this to be of reasonable quality, it is essential that the spatial resolution of the digital film equals that of analogue video.

To DV video

In theory, it is quite easy: Press record on your DV equipment, and use your DV editing software to play to DV. In practice, there are two obstacles:

Many DV cameras sold in Europe cannot record from DV input. (In fact, the functionality is there, but is blocked in order to save customs duties.) So if you are going to purchase a DV camera as the only DV equipment (except the computer), be sure to buy one that can record DV sources.
If, besides rearranging stuff that came from the DV source, you added pictures (e.g. graphics created in the computer) or special effects, these are probably not compressed with the DV compressor. Even if they are, the output to DV may fail if the material in your computer is kind of a patchwork. You may have to "flatten" your DV stream first before it can be successfully played back. For this, however, you may need the same amount of free hard disk space as for your almost-ready material!

To CD-ROM or DVD-ROM or to the Internet

If you had your target delivery system in mind when choosing the size and the compressor, you are almost done when heading for a CD-ROM or DVD-ROM production: You can add the movie files to the resources of the multimedia integration system you work with. But make sure that you test the end results. Not everything that plays fine from a hard disk also plays fine when coming from CD-ROM. In the early days of CD-ROM, the data transfer limitation of around 100 KByte/s was THE limiting factor when deciding about compression. But even today, when most CD drives are at least eight times faster, data rate peaks in the video stream may cause problems. For a movie to become part of a web site, an additional step may be necessary: For long videos delivered from a web server, users might have a better experience if you choose the Real Time Streaming Protocol (RTSP) instead of HTTP, the standard for web delivery. For optimal delivery via RTSP, hinting is recommended, i.e. adding markers that tell the server in advance what is coming next. This hinting process can be done in most up-to-date video compression programs.
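Why peaks matter even when the average data rate looks harmless can be shown with a small sketch. The frame sizes below are invented, the drive limit of 800 KByte/s is simply eight times the 100 KByte/s mentioned above, and 1 KByte is taken as 1000 bytes; the function is an illustration, not part of any real tool.

```python
def peak_rate_kbyte_per_s(frame_sizes_bytes, fps: int = 25) -> float:
    """Return the highest data rate over any window of one second (fps frames)."""
    if len(frame_sizes_bytes) < fps:
        raise ValueError("need at least one second of frames")
    window = sum(frame_sizes_bytes[:fps])
    peak = window
    # Slide the one-second window along the stream frame by frame.
    for i in range(fps, len(frame_sizes_bytes)):
        window += frame_sizes_bytes[i] - frame_sizes_bytes[i - fps]
        peak = max(peak, window)
    return peak / 1000


# Hypothetical four-second clip: quiet scene, one busy second, quiet scene.
frames = [4000] * 50 + [40000] * 25 + [4000] * 25
DRIVE_LIMIT = 800  # KByte/s, assumed: eight times 100 KByte/s

average = sum(frames) / 1000 / (len(frames) / 25)
peak = peak_rate_kbyte_per_s(frames)

print(f"average {average:.0f} KByte/s, peak {peak:.0f} KByte/s, "
      f"limit {DRIVE_LIMIT} KByte/s")
print("peak exceeds drive limit" if peak > DRIVE_LIMIT else "within limit")
```

In this invented example the average stays well below the limit while the busy second exceeds it, which is exactly the situation that only shows up when testing from the actual medium.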

To DVD-Video

A DVD playable in a standard DVD player is nothing but a DVD-ROM with special contents, as the contents need to be played back by a program whose functionality was defined some time ago. For example, this means that all video files need to be in either MPEG-1 or MPEG-2 format. As MPEG, especially MPEG-2, is not very handy for editing, you have probably worked with another compression scheme. Therefore, the whole video contents need to be converted to MPEG-2 in its DVD variant. You often find the necessary converter as part of the DVD authoring environments that you use to create tables of contents, define what happens if the user presses a button on the remote control, etc. As MPEG-2 encoding is very computing-intensive, it is advisable to have a hardware-supported solution if you plan to produce DVDs regularly.

What is special for sign language productions?

Bluescreen and DV

DV cameras offer superb quality when compared to similarly priced analogue cameras. On the other hand, DV itself is also a lossy compressed format. For example, it uses a technique called 4:2:0 colour subsampling (for PAL, as does MPEG). This can turn out to be a problem with signers when using bluescreen technology: The signer signs in front of a blue background, and the background is later on (in the computer) replaced by still or moving images. The colour ramps resulting from the DV compression at the edge of the signer's body may then be too coarse to faithfully separate the signer from the background, resulting in some artefacts at the border of the signer and the new background in the target material. This is not so much a problem with speaking moderators as other techniques such as soft masks can be used to compensate, but it is with signers since they are moving all the time. The solution is to use the higher quality DV format DV50 with twice the data rate (and, of course, "slightly" higher prices for the cameras).
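What 4:2:0 subsampling means for the colour resolution can be illustrated with a few lines. In 4:2:0, brightness is stored for every pixel, while each of the two colour components is stored at half the horizontal and half the vertical resolution. The frame size of 720 x 576 is our assumption for PAL DV; the function name is illustrative.

```python
def plane_sizes_420(width: int, height: int):
    """Return (luma, chroma) plane dimensions for 4:2:0 subsampling."""
    luma = (width, height)                 # brightness: one sample per pixel
    chroma = (width // 2, height // 2)     # each of the two colour planes
    return luma, chroma


luma, chroma = plane_sizes_420(720, 576)   # assumed PAL DV frame size
luma_samples = luma[0] * luma[1]
chroma_samples = 2 * chroma[0] * chroma[1]
samples_per_pixel = (luma_samples + chroma_samples) / luma_samples

print("brightness plane:", luma)
print("each colour plane:", chroma)
print("samples per pixel:", samples_per_pixel, "(instead of 3)")
```

One colour value thus covers a block of 2 x 2 pixels, which is why the blue of the background cannot be separated from the signer's outline with single-pixel precision.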

Audio quality

Audio quality today is not a problem for computers as the data rates needed for excellent sound are negligible when compared to video. For CD-ROM or DVD-ROM you might even go below usual standards to save some bytes which you can then spend on slightly better video as this is more essential in your context. What is annoying with technology such as DV is that you cannot tell your camera not to record sound which means that you have to delete sound tracks over and over again if you are working on a project without sound.

Frame rates and movement reception

One of the common practices to save bandwidth in CD-ROM, DVD-ROM, or Internet applications is to reduce the number of frames per second of video from the standard 25 to a substantially lower number. While in many circumstances the impression is good enough with only 8 frames per second, this is not true for sign language with its relatively fast movements. The minimum rate depends, of course, on the signer(s) in front of your camera; but you will probably find that 12.5 frames per second is the minimum before viewers complain that signs are difficult to recognise. But even with this frame rate, some movements do not appear as "smooth" as they should. In the literature, you often find the recommendation to use 15 frames per second. However, this is only true for NTSC with a native rate of approximately 30 frames per second (used in America). The idea behind using 12.5 fps for PAL or 15 fps for NTSC is the same: Every second frame is left out. (In the same way, other preferred rates result from taking only every third or fourth frame.) To achieve higher rates, however, you would have to leave out fewer than every second frame, resulting in an uneven distribution in time of the remaining frames. This is also recognised by the viewer as distorting the movements (the so-called "judder" effect). So basically the choice is 12.5 frames or 25 frames (or something close to that).
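The uneven distribution can be made visible with a short sketch. It picks, for a target frame rate, the source frame closest before each ideal point in time and lists the gaps between the frames kept. This simple selection rule is our assumption; actual conversion software may choose frames differently.

```python
from fractions import Fraction
from math import floor


def kept_frames(source_fps, target_fps, count: int):
    """Return the indices of the first `count` source frames that are kept."""
    step = Fraction(source_fps) / Fraction(target_fps)
    return [floor(i * step) for i in range(count)]


def gaps(indices):
    """Return the distance between each kept frame and the next one."""
    return [b - a for a, b in zip(indices, indices[1:])]


half_rate = kept_frames(25, Fraction(25, 2), 7)   # 12.5 fps from PAL
odd_rate = kept_frames(25, 15, 7)                 # 15 fps from PAL

print("12.5 fps:", half_rate, "gaps", gaps(half_rate))
print("15 fps:  ", odd_rate, "gaps", gaps(odd_rate))
```

At 12.5 frames per second all gaps are equal, whereas at 15 frames per second taken from PAL material the gaps alternate between one and two source frames, which the viewer perceives as judder.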

Double fields and movement blur

For historical reasons, TV images do not really consist of 25 images per second, but instead of 50 half images per second that are interlaced: At one point of time, all odd lines are transmitted (the odd field), and 1/50 of a second later, the even lines are transmitted (the even field). This means that there is a time difference between two adjacent lines of a frame as you see it on a (non-interlaced) computer monitor. With quick movements, this can easily be detected from the zigzag lines showing that the object has moved during that 1/50 of a second. And this clearly shows with signing. To avoid this problem when going from interlaced to non-interlaced, it is easiest to simply throw out one field and to work only with the other one. Of course, this costs spatial resolution, i.e. quality. And this explains why quarter-screen movies are quite common on CD-ROMs: They are the largest format that can be squeezed out of an interlaced format by throwing away one field.
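Throwing away one field amounts to keeping every second line. The sketch below does this for a frame represented as a plain list of rows and then halves the width as well, so that the picture keeps its proportions. The frame size of 768 x 576 and the placeholder pixel values are assumptions for the example.

```python
def drop_field(frame_rows, keep_odd: bool = True):
    """Keep only one field: lines 1, 3, 5, ... (odd) or 2, 4, 6, ... (even)."""
    start = 0 if keep_odd else 1   # line 1 is stored at index 0
    return frame_rows[start::2]


def halve_width(rows):
    """Drop every second pixel so the picture keeps its proportions."""
    return [row[::2] for row in rows]


# Hypothetical 768 x 576 frame; each pixel is just a placeholder value.
frame = [[0] * 768 for _ in range(576)]
quarter = halve_width(drop_field(frame))

print(len(quarter[0]), "x", len(quarter))
```

The result has half the width and half the height of the original, i.e. a quarter of the screen area.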

Even modern DV technology works with interlaced frames. However, there is an exception: Some cameras offer "progressive scan" technology which is nothing else but shooting the image in non-interlaced mode, i.e. really 25 full frames per second instead of 50 half frames per second. As price differences are not that high, we consider this feature highly recommendable for DV cameras to be used as input for signing CD-ROM or DVD-productions.

Subtitling

Even in signing books productions, subtitles may play a significant role (cf. the main part of this deliverable). In closed captioning, subtitles are only visible if the user switches them on, while open captioning means that the subtitles are visible all the time which might be annoying for people who cannot or need not make use of them.

While closed captioning is quite common in television, it is relatively new for video, and many VCRs are still not capable of handling closed captioning. Keep this in mind when planning for your target group.

On DVD, on the other hand, subtitling is a standard user-selectable feature. In many productions, the user can choose between dozens of languages for the subtitles, sometimes including simplified language.

The production of subtitles for DVD is as simple as it is for the major computer video architectures: You create a text file associating time codes with specific parts of the text; where necessary, formatting options are added. This file is then imported into the movie as an additional text track. So if it makes sense for your production, do not leave out subtitles: they are not too much work (assuming that you have a text version of your script), and you do not need any additional software.
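The principle of such a time-coded text file can be sketched as follows. The file layout used here (a start time followed by the caption, with an empty caption clearing the display) is invented for this illustration and is not the import format of any particular architecture or authoring tool.

```python
# Hypothetical subtitle file: start time, then the caption text.
CUE_FILE = """\
00:00:01.000 Welcome to this signing book.
00:00:04.500 Chapter one: Getting started.
00:00:09.000
"""


def parse_time(stamp: str) -> float:
    """Convert 'HH:MM:SS.mmm' into seconds."""
    hours, minutes, seconds = stamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def parse_cues(text: str):
    """Return a list of (start time in seconds, caption) in file order."""
    cues = []
    for line in text.splitlines():
        if not line.strip():
            continue
        stamp, _, caption = line.partition(" ")
        cues.append((parse_time(stamp), caption.strip()))
    return cues


def caption_at(cues, seconds: float) -> str:
    """Return the caption visible at the given time ('' if none)."""
    current = ""
    for start, caption in cues:
        if start > seconds:
            break
        current = caption
    return current


cues = parse_cues(CUE_FILE)
print(repr(caption_at(cues, 5.0)))
```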

With computer video architectures you have an extra option: As you are free to choose the aspect ratio of the video frame, the subtitles need not overlay the video images but can be placed outside them.

How shall I decide?

Now that you have found your way through this appendix, you still do not know what to buy? We tried to give you some background knowledge on which you may base your decisions. In the usual case of limited budgets, you will still need to decide where to trade off. Only you can do that because you know what you want to produce and what will be most important in your future productions. And whatever you then decide to buy, be assured that there will be a better solution available only three months later. But do not blame yourself for that; this is the current speed of technological innovation.

Where can I find more information?

QuickTime: http://www.apple.com/quicktime

MPEG: http://www.mpeg.org

RealSystem G2: http://www.terran.com/CodecCentral/Architectures/RealMedia.html

DirectShow and WindowsMedia Technologies: http://www.microsoft.com/windows/windowsmedia

FireWire: http://www.apple.com/firewire

SMIL: http://www.w3c.org/AudioVideo

XML: http://www.w3c.org/XML

HTML: http://www.w3c.org/MarkUp

Everything you ever wanted to know about DV: http://www.computervice.com/DV-L

Everything you ever wanted to know about DVD: http://www.dvdresource.com/dvdfaq/dvdfaq.shtml

Compression schemes and computer video architectures: http://www.terran-int.com/CodecCentral/

On many of these subjects, you can easily find dozens of books. When you think about buying one, keep in mind how fast technology moves on: The book should not be older than approximately a year.

None of these sites can tell you how to become a good cutter. This is an area where consulting one of the many books on the market is the second best solution after being taught by someone who knows…