Next: Results Up: 5.3.4 Positional histograms Previous: 5.3.4 Positional histograms

Principle

Anyone acquainted with pattern recognition in any field will have come across histograms. Surprisingly, however, their application to gesture recognition is, to the best of my knowledge, novel.

A histogram can be thought of as a discretised probability density function. Basically, we segment the range of possible values into subranges, and then count the number of instances that fall into each subrange.
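
To make the link with probability densities concrete (the symbols below are introduced only for illustration): if $n$ values are drawn from a density $p(x)$ and $n_i$ of them fall in the subrange $[b_i, b_{i+1})$, then for large $n$

$$\frac{n_i}{n} \approx \int_{b_i}^{b_{i+1}} p(x)\,dx$$

so the normalised counts approximate the probability mass of each subrange, and dividing each by the width of its subrange gives an estimate of the density itself.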

This is exactly what we do with the signs, for a number of aspects of each sign. One obvious histogram to take is of the x, y and z positions of the hand.

We have to be a bit careful about how we do this, however, to get useful information. In particular, if a sign is slightly more exaggerated, either in time or space, then we want its histogram to remain reasonably invariant.

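Thus we calculate the histograms in the following way. Let $d$ be the number of divisions into which we wish to divide the range, and let $h_i$, with $0 \le i < d$, be the ``columns'' of the histogram. Assume we are doing it for the x-position only, sampled as $x_1, \ldots, x_n$ over the course of the sign. Then:

$$h_i = \frac{1}{n} \left| \{\, t : b_i \le x_t < b_{i+1} \,\} \right|$$

where

$$b_i = x_{\min} + \frac{i}{d}\,(x_{\max} - x_{\min})$$

and the largest value $x_{\max}$ is counted in the last column.

Effectively, this ``normalises'' by two things: the duration of the sign, since each column is divided by the number of samples $n$; and the spatial extent of the sign, since the divisions span only the range between $x_{\min}$ and $x_{\max}$.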
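
The calculation described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the original implementation: it assumes that the range is taken from the smallest and largest x-values of the sign itself, and that each column is divided by the number of samples. The function name `positional_histogram` and the sample data are hypothetical.

```python
from typing import List, Sequence


def positional_histogram(xs: Sequence[float], d: int) -> List[float]:
    """Return d columns, each the fraction of samples in that division.

    Assumption: the divisions split [min(xs), max(xs)] into d equal parts.
    """
    if d < 1:
        raise ValueError("d must be at least 1")
    if not xs:
        raise ValueError("need at least one sample")
    lo, hi = min(xs), max(xs)
    span = hi - lo
    counts = [0] * d
    for x in xs:
        if span == 0:
            i = 0  # degenerate case: no movement along this axis
        else:
            # Map x to a division index; x == hi is clamped into the last column.
            i = min(int((x - lo) * d / span), d - 1)
        counts[i] += 1
    n = len(xs)
    return [c / n for c in counts]


# Hypothetical x-positions for one sign.
sign = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# The same motion made three times larger and shifted (exaggerated in space).
bigger = [3 * x + 7 for x in sign]
# The same motion performed at half speed (exaggerated in time).
slower = [x for x in sign for _ in range(2)]

print(positional_histogram(sign, 5))
print(positional_histogram(bigger, 5))
print(positional_histogram(slower, 5))
```

Under these assumptions all three calls produce the same columns, which is the invariance to exaggeration in time and space that we are after.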

To clarify what a histogram means, a 5-division histogram for the sign `same' (which we saw in figure 5.1) is shown in figure 5.2.

  
Figure 5.2: An example of a histogram -- in this case of motion in the x-dimension.

As you can see, there is a strong presence in the middle -- this isn't a surprise, since this is where the sign starts. The right is weak, since it seems to contain only a few points resulting from ``swinging back'' too far on the return. On the left hand side, however, there is another strong peak, which is caused by the slowing down of the motion that occurs in the sign.

In this set of features, we consider the histograms of the x-position, the y-position and the z-position. Histograms of other data will follow.

There is one question which cannot be answered by theory alone. This question is: What is the optimum number of divisions d that will result in the smallest error? The answer depends on a number of factors, such as the accuracy of the equipment, and the nature of the signs themselves. Too few divisions, and there will be insufficient information to distinguish signs; too many divisions and noise and variation in the data will cause crossover between adjacent columns in the histogram, resulting in erroneous classification.
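
The trade-off can be illustrated with a toy example. The sketch below is not an experiment on real sign data: the trajectory, the jitter of two units, the choice of summed absolute difference as the distance, and the names `histogram` and `l1_distance` are all assumptions made for illustration. It compares the histogram of a steady sweep with that of a slightly jittered copy, for several values of d.

```python
from typing import List, Sequence


def histogram(xs: Sequence[float], d: int) -> List[float]:
    """Fraction of samples in each of d equal divisions of [min(xs), max(xs)]."""
    lo, hi = min(xs), max(xs)
    span = hi - lo
    counts = [0] * d
    for x in xs:
        i = 0 if span == 0 else min(int((x - lo) * d / span), d - 1)
        counts[i] += 1
    return [c / len(xs) for c in counts]


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute column differences between two histograms."""
    return sum(abs(p - q) for p, q in zip(a, b))


# Hypothetical data: a steady sweep from 0 to 100, and the same sweep with
# alternating jitter of +2 / -2 units on every sample except the endpoints.
clean = list(range(0, 101, 5))
noisy = [x if k in (0, len(clean) - 1) else x + (2 if k % 2 else -2)
         for k, x in enumerate(clean)]

distances = {d: l1_distance(histogram(clean, d), histogram(noisy, d))
             for d in (1, 2, 4, 20)}
for d, dist in distances.items():
    print(f"d={d:2d}  distance={dist:.3f}")
```

On this toy data the two histograms are identical at d = 1 (a single column carries no information at all), differ by 2/21 at d = 2 and d = 4, and differ by 18/21 at d = 20, where the jitter pushes most samples into a neighbouring column. A real choice of d would still have to be made by measuring classification error on actual signs.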






waleed@cse.unsw.edu.au