It was found that while the information coming from the fingers was
reasonably clean because of the automatic calibration that occurred
between each sign, positional information (in terms of x, y and z)
was often found to ``glitch''; that is, to change to values that were
not just randomly noisy, but physically impossible. There was of
course much noise on the position (as would be expected with 8-bit
resolution), but most of it was no more than 1 to 2 centimetres of
displacement. Occasionally, however, the data returned would indicate
that the hand had moved 40 or so centimetres in 1/25th of a second,
which would hardly ever occur in the course of normal signing.
There are several causes for this:
Steps were taken to reduce the effect of noise. For example, the
environment was arranged so that the walls were at an
angle, so that reflections were not likely to interfere. The walls were
also covered with a thin material which reduces the intensity of the
reflected signals. Tests with the fluorescent lights on and off showed
that they had little effect on performance.
To remove these glitches a heuristic method was adopted. This is best
expressed in mathematical notation.
Each file analysed consists of a sequence of frames. In each frame there is information about x, y and z position, wrist rotation and finger bend.
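Let $(x_i, y_i, z_i)$ be the x-position, y-position and z-position
stored in the $i$th frame of the sample.
Then we can define the change in position from the last frame as:
\[
\Delta x_i = x_i - x_{i-1}, \qquad
\Delta y_i = y_i - y_{i-1}, \qquad
\Delta z_i = z_i - z_{i-1}
\]
The vector $\Delta_i = (\Delta x_i, \Delta y_i, \Delta z_i)$ is thus a pointer in
the direction we are going in. Thus if we take the length of this
vector and call it $\|\Delta_i\|$, i.e.:
\[
\|\Delta_i\| = \sqrt{(\Delta x_i)^2 + (\Delta y_i)^2 + (\Delta z_i)^2}
\]
then this will give us a measure of speed. In effect this is the first (discrete) derivative with respect to time.
Similarly we define the second derivatives:
\[
\Delta^2 x_i = \Delta x_i - \Delta x_{i-1}, \qquad
\Delta^2 y_i = \Delta y_i - \Delta y_{i-1}, \qquad
\Delta^2 z_i = \Delta z_i - \Delta z_{i-1}
\]
Here again, the vector $\Delta^2_i = (\Delta^2 x_i, \Delta^2 y_i, \Delta^2 z_i)$ is a
vector that represents the change in direction that has occurred.
$\|\Delta^2_i\|$ is similarly the length or norm of that vector.
Thus an intuitive way to filter the glitches is to examine
$\|\Delta_i\|$ and $\|\Delta^2_i\|$. If
these exceed the practical limits, then average the previous point
with the next point and set the value of this point to that average.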
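The filter described above can be sketched in a few lines. The original was written as Perl scripts that are not reproduced here, so the following Python version is only an illustration under stated assumptions: the names `deglitch`, `max_speed` and `max_accel` are hypothetical, the limits are left to the caller because the text does not give their values, a frame is flagged when either the first or the second difference exceeds its limit, and the correction is applied in place so that later frames are compared against already-corrected ones.

```python
import math


def norm(v):
    """Euclidean length of a vector given as a sequence of numbers."""
    return math.sqrt(sum(c * c for c in v))


def sub(a, b):
    """Component-wise difference a - b."""
    return tuple(p - q for p, q in zip(a, b))


def deglitch(frames, max_speed, max_accel):
    """Return a copy of frames with glitched positions averaged out.

    frames    -- sequence of (x, y, z) positions, one per frame
    max_speed -- assumed limit on the length of the first difference
    max_accel -- assumed limit on the length of the second difference

    The first two frames and the last frame are left untouched, since
    they lack the neighbours needed for the differences or the average.
    """
    out = [tuple(float(c) for c in f) for f in frames]
    for i in range(2, len(out) - 1):
        d_prev = sub(out[i - 1], out[i - 2])  # first difference at i-1
        d_cur = sub(out[i], out[i - 1])       # first difference at i
        dd = sub(d_cur, d_prev)               # second difference at i
        # Assumption: exceeding either limit marks the frame as a glitch.
        if norm(d_cur) > max_speed or norm(dd) > max_accel:
            # Replace the point by the mean of its two neighbours.
            out[i] = tuple((p + q) / 2 for p, q in zip(out[i - 1], out[i + 1]))
    return out


if __name__ == "__main__":
    track = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (40, 0, 0), (4, 0, 0), (5, 0, 0)]
    print(deglitch(track, max_speed=10, max_accel=10)[3])  # (3.0, 0.0, 0.0)
```

Requiring both limits to be exceeded instead of either is a one-word change (`or` to `and`); the text does not say which reading was used.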
This filter is very empirical and the careful eye will observe some
problems with it. For example, what happens if you get two glitch
readings in a row? Clearly the above algorithm assumes that the next
value after a glitch is a sensible one. It was found empirically that
two glitches in a row rarely occurred. Furthermore, even if they did
occur, the glitches would still be smoothed over, but to a lesser
extent.
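A small one-dimensional example makes the two-glitch case concrete. The numbers are illustrative and not taken from the experiment. Assume that the filter is applied in place, frame by frame, that a frame is flagged when it lies more than a hypothetical limit of 10 units from the (possibly corrected) previous frame, and that four successive readings are 2, 40, 41 and 5, the middle two being glitches. Writing these as $x_0, \ldots, x_3$ and the corrected values as $x'_1, x'_2$:
\[
x'_1 = \frac{x_0 + x_2}{2} = \frac{2 + 41}{2} = 21.5, \qquad
x'_2 = \frac{x'_1 + x_3}{2} = \frac{21.5 + 5}{2} = 13.25
\]
The sequence becomes 2, 21.5, 13.25, 5. The largest frame-to-frame jump drops from 38 units to 19.5, so the glitches are damped rather than removed, and the final reading, now only 8.25 units from its predecessor, is left alone.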
In fact, this can be thought of as nothing more than a specialised threshold filter.
Other filters were considered, such as Kalman filters, moving average filters and so on. However, these were not implemented for a number of reasons:
This filter was implemented, as were all the feature selection steps, in simple Perl scripts.
No major filtering of outlier data samples was undertaken, although
some of the data that was incorrectly collected was discarded (such as one example
where the user had taken a drink and forgotten to press ``B'' to stop
data capture, resulting in a file with over 600 frames in
it). This occurred four
times in the course of sampling.
We will now go on to discuss and assess the features that we extract.