Edwards, A. D. N. (1997). If the glove fits. Ability (22): pp. 12-13

Progress in sign language recognition

Alistair Edwards
Department of Computer Science
University of York
Heslington
York
England
YO10 5DD

For many deaf people, sign language is the principal means of communication. One problem is that very few people who are not themselves deaf ever learn to sign. This increases the isolation of deaf people: many of their interactions may be confined to communicating only with other deaf people. It seems that technology might have a role to play here, if computers could be programmed to recognize sign language and to translate it into another form, such as (synthesized) speech or written text. The technology needed to attempt this translation is now available, but there are considerable problems to be solved before it becomes a reality.

In this article I will summarize some of those problems and the progress that is being made to overcome them, with particular reference to work we have been doing at the University of York.

The first requirement for sign language recognition is to capture the gestures that are made as signs. Instrumented clothing, originally designed for virtual reality applications, is available which will provide very accurate data on hand movements. The commonest example is the data glove, but the accompanying picture shows a complete jacket. Sensors on the joints of the hands and arms measure how far each joint is bent, and these readings can be combined with those from other sensors that give the position of the hands in space. The advantage of such clothing is that the data it generates is very accurate; the disadvantage is that the clothing is cumbersome and intrusive. Who would want to dress up like that every time they wanted to talk to a non-signing person?
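
To make the idea concrete, one frame of data from such a glove might be represented in software roughly as sketched below. The field names, units and number of sensors are my own illustration, not those of any particular product.

# A sketch of how one frame of data from an instrumented glove might look.
# Field names, units and sensor layout are illustrative assumptions only.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class GloveFrame:
    timestamp_ms: int                              # when the frame was captured
    finger_bend: List[float]                       # bend angle (degrees) for each joint sensor
    wrist_position: Tuple[float, float, float]     # position of the hand in space (x, y, z)
    wrist_orientation: Tuple[float, float, float]  # roll, pitch and yaw of the hand (degrees)

# A recogniser would receive a stream of such frames, perhaps 50 per second,
# and work on the whole sequence rather than on any single frame.
frame = GloveFrame(
    timestamp_ms=0,
    finger_bend=[12.0, 5.5, 80.0, 75.0, 60.0, 58.0, 40.0, 35.0, 20.0, 18.0],
    wrist_position=(0.10, 1.05, 0.32),
    wrist_orientation=(0.0, -15.0, 90.0),
)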

The alternative is to use video images of the signer. This is clearly much less intrusive. However, the computational task of analysing the picture and working out exactly where the hands are is much more difficult. Most current experimental systems rely on help in the form of the signer wearing differently coloured gloves.
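
As a very rough illustration of the coloured-glove idea, the sketch below marks the pixels of a video frame whose colour is close to a known glove colour and takes their average position as the hand. Real systems are considerably more sophisticated; the frame format, colour and threshold here are assumptions of mine.

# A toy sketch of locating a coloured glove in a video frame.
# 'frame' is assumed to be a list of rows, each row a list of (r, g, b) pixels.

GLOVE_COLOUR = (220, 40, 40)   # assumed glove colour (bright red), 0-255 per channel
THRESHOLD = 60                 # how close a pixel must be to count as "glove"

def glove_pixels(frame):
    """Return the (row, column) coordinates of pixels that look like the glove."""
    hits = []
    for y, row in enumerate(frame):
        for x, (r, g, b) in enumerate(row):
            # Simple colour distance; real systems use better colour models.
            distance = abs(r - GLOVE_COLOUR[0]) + abs(g - GLOVE_COLOUR[1]) + abs(b - GLOVE_COLOUR[2])
            if distance < THRESHOLD:
                hits.append((y, x))
    return hits

def hand_centre(frame):
    """Estimate where the hand is as the average position of glove-coloured pixels."""
    hits = glove_pixels(frame)
    if not hits:
        return None
    y = sum(y for y, _ in hits) / len(hits)
    x = sum(x for _, x in hits) / len(hits)
    return (y, x)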

Once the data on hand positions, shapes and movements has been captured, the individual signs have to be recognized. There are two parts to this. First there is the segmentation of the signs: where does one sign end and the next begin? A number of approaches have been tried, but as yet none has emerged as clearly the best.
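
One simple approach, offered here only as an illustration and not as the method used by any of the groups mentioned, is to assume that the hands slow down briefly between signs and to cut the stream of sensor frames wherever the amount of movement falls below a threshold.

# A toy segmentation sketch: split a sequence of sensor frames into candidate
# signs at moments when the hands are nearly still. The frame format and both
# thresholds are assumptions made for illustration.

PAUSE_THRESHOLD = 0.02   # movement below this between frames counts as a pause
MIN_SIGN_LENGTH = 5      # ignore segments shorter than this many frames

def movement(frame_a, frame_b):
    """Total change in joint angles between two consecutive frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b))

def segment(frames):
    """Split a list of frames (each a list of joint angles) into candidate signs."""
    segments, current = [], []
    for previous, frame in zip(frames, frames[1:]):
        if movement(previous, frame) < PAUSE_THRESHOLD:
            # The hands are (almost) still: treat this as a boundary between signs.
            if len(current) >= MIN_SIGN_LENGTH:
                segments.append(current)
            current = []
        else:
            current.append(frame)
    if len(current) >= MIN_SIGN_LENGTH:
        segments.append(current)
    return segments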

Having decided which segment of the data represents a single sign, it is necessary to decide which sign it is. This is essentially a pattern-matching task, and a number of techniques have been applied. Several groups have used artificial neural networks for this, with some success, while others use Hidden Markov Models, a technique originally developed for speech recognition.
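
The details of neural networks and Hidden Markov Models are beyond the scope of this article, but the flavour of the pattern-matching step can be conveyed with a much cruder stand-in of my own: store a template for each known sign and label an unknown gesture with the sign whose template it is closest to. This is purely an illustration, not the technique used by any of the systems described here.

# A toy pattern matcher: each known sign is stored as a template (a fixed-length
# list of numbers summarising the gesture), and an unknown gesture is labelled
# with the closest template. Real systems use neural networks or Hidden Markov
# Models rather than anything this simple.

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def recognise(gesture, templates):
    """Return the name of the stored sign whose template best matches 'gesture'.

    'templates' maps a sign name to its feature vector, e.g.
    {"hello": [0.1, 0.9, ...], "thank you": [0.7, 0.2, ...]}.
    """
    best_sign, best_distance = None, float("inf")
    for sign, template in templates.items():
        d = distance(gesture, template)
        if d < best_distance:
            best_sign, best_distance = sign, d
    return best_sign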

At the University of York we have been investigating the possibilities of using the data jacket, which is manufactured by the Welsh company TCAS. We have developed a novel segmentation technique, based on modelling the tension in the hand and the acceleration of the hand. For gesture recognition we use artificial neural networks. Using these techniques we have demonstrated that we can recognize a simple class of gestures. We are currently seeking the small amount of funding we need to finish the current phase of the work by demonstrating that we can recognize British Sign Language (BSL) finger spelling. From there we hope to go on to develop techniques for recognizing the whole set of signs that make up BSL.

We are by no means the only researchers working in this area. In 1996 we organized a workshop on the topic of gesture recognition, including sign language. It was attended by people from all over Europe and was so successful that it was followed this year by another workshop in Germany. It seems that at this stage the most successful sign language recognition system is the one developed at Hitachi's laboratories in Japan. They have demonstrated the feasibility of sign-to-speech translation for Japanese Sign Language (JSL). However, their system is still somewhat limited: it does not recognize arbitrary JSL utterances, but only those it has been pre-trained for, about 100 in number.

So, advances are being made in capturing and recognizing the gestures of sign language. Unfortunately this does not necessarily mean that the goal of full sign language translation will be achieved soon. I foresee three main challenges beyond gesture recognition itself.

First, a large component of the meaning of sign language is conveyed not in the signs themselves, but in the way they are made. For instance, the speed associated with an object might be communicated in the speed of the hand movement associated with its sign.

Related to that is the second problem: there is currently no formal way of representing the grammar of sign languages. For instance, the word order of most sign languages is not the same as that of the spoken language. To say 'My name is...' in BSL one would sign 'My name me...' Without a grammar it would be very difficult to make the translation. Furthermore, any such grammar is likely to be very difficult to represent, as it will have to capture elements of the three-dimensional space in which signs are made.

Finally, not all the meaning of sign language is conveyed by the hands. Other parts of the body are involved, in particular the face. There has been some work on capturing and interpreting facial expressions, but as yet it is only possible to recognize categories as broad as 'happy', 'sad' and 'surprised'.

The technology exists for sign language interpretation, but it will be a while yet before the techniques catch up and make it an everyday reality.

Further information

The papers presented at the York gesture workshop were published as Harling and Edwards (1996), which is available directly from me at the University of York. The proceedings of the latest workshop have not yet been published, but further details are available on the Web at http://www.TechFak.Uni-Bielefeld.DE/GW97/. Details of the Hitachi sign-to-speech system can be found in Sagawa, Takeuchi and Ohki (1997).

Harling, P. A. and Edwards, A. D. N. (Eds.) (1996). Progress in Gestural Interaction: Proceedings of Gesture Workshop '96. London: Springer.

Sagawa, H., Takeuchi, M. and Ohki, M. (1997). Description and recognition methods for sign language based on gesture components. In Proceedings of IUI 97 (Orlando, Florida), ACM, pp. 97-104.