
RELATED WORK

The best developed technique for temporal classification is the hidden Markov model (HMM; see [Rabiner and Juang, 1986] for more details). HMMs have proved very useful for speech recognition and are the basis of most commercial systems. Despite this, they suffer some serious drawbacks for general use. Firstly, the structure of the HMM, which is similar to that of a finite state machine, needs to be specified a priori, and the structure selected can have a critical effect on performance. Secondly, extracting comprehensible rules from HMMs is not at all easy. Thirdly, even under very strong assumptions about the probability model, there are frequently hundreds or thousands of parameters per HMM. As a result, many training instances are required to learn effectively. None of these three problems is of primary importance in the speech recognition domain.
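To give a feel for the third drawback, the sketch below counts the free parameters of a single HMM. It is an illustration only: it assumes a fully connected HMM with one Gaussian emission density per state, and the function name, arguments and example sizes are hypothetical rather than taken from any system discussed here.

```python
def hmm_free_parameters(n_states, n_dims, full_covariance=False):
    """Count free parameters of a fully connected Gaussian-emission HMM.

    Assumptions (illustrative only): every state can reach every other
    state, and each state emits from a single Gaussian over n_dims features.
    """
    # Initial state distribution: probabilities sum to 1, so one is implied.
    initial = n_states - 1
    # Each row of the transition matrix also sums to 1.
    transitions = n_states * (n_states - 1)
    if full_covariance:
        # Mean vector plus a symmetric covariance matrix per state.
        per_state = n_dims + n_dims * (n_dims + 1) // 2
    else:
        # Mean and variance per feature (the "strong assumption" case).
        per_state = 2 * n_dims
    return initial + transitions + n_states * per_state


# Hypothetical sizes: 5 states, 12 features per observation.
diagonal_count = hmm_free_parameters(5, 12)
full_count = hmm_free_parameters(5, 12, full_covariance=True)
```

With these hypothetical sizes the diagonal-covariance model already has 144 free parameters and the full-covariance model 474, which suggests why many training instances are needed.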

A recent development has been the use of dynamic Bayes networks (DBNs), a superset of HMMs, for temporal classification tasks. DBNs augment HMMs by allowing for more complex representations of the state space. Zweig and Russell [Zweig and Russell, 1998], for example, obtained higher accuracy than conventional hidden Markov models on a speech recognition task. The structure of their dynamic Bayes network was informed by physiological models of the vocal system; while this is feasible for speech recognition, it may not be for other tasks. Friedman et al. [Friedman et al., 1998] explore some techniques for learning DBN structure, but these techniques are still primitive.

Another technique that has gained some use is the recurrent neural network (RNN); [Bengio, 1996] has a good discussion of the application of RNNs to temporal classification. This method uses a normal feed-forward neural network, but introduces a "context layer" that is fed back to the hidden layer one timestep later. The context layer allows some state information to be retained. RNNs suffer from many of the same problems as HMMs: they are incomprehensible, they have many parameters that must be set with little guidance from theory, and they learn slowly. In addition, they do not work well for longer sequences of observations [Bengio, 1996].
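The context-layer mechanism described above can be sketched as a forward pass in which the hidden activations from one timestep are copied into the context layer and fed back at the next. This is a minimal sketch in the style usually called an Elman network; the function names, the tanh activation and the weight layout are assumptions made for illustration, and no training procedure is shown.

```python
import math


def elman_step(x, context, w_in, w_ctx, w_out):
    """One timestep: the hidden layer sees the input and the previous context."""
    hidden = []
    for in_row, ctx_row in zip(w_in, w_ctx):
        activation = sum(w * xi for w, xi in zip(in_row, x))
        activation += sum(w * c for w, c in zip(ctx_row, context))
        hidden.append(math.tanh(activation))
    output = [sum(w * h for w, h in zip(row, hidden)) for row in w_out]
    # The hidden activations become the context for the next timestep.
    return output, hidden


def run_sequence(xs, w_in, w_ctx, w_out):
    """Feed a sequence of input vectors through the network in order."""
    context = [0.0] * len(w_in)  # no state before the first observation
    outputs = []
    for x in xs:
        output, context = elman_step(x, context, w_in, w_ctx, w_out)
        outputs.append(output)
    return outputs
```

With a non-zero context weight, the output at the second timestep depends on the first input even when the second input is zero; setting the context weights to zero reduces the network to an ordinary feed-forward one.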

Some work has also been completed on signals with high-level event sequence descriptions. Rather than representing temporal information as a time-sampled signal, as in this work, temporal information is represented as a set of timestamped events with parameters. This is a higher level of temporal abstraction than is used in this work, but is applicable to many problems, for example network traffic analysis [Mannila et al., 1995] or network failure analysis [Oates et al., 1998]. In such cases, researchers look for sequences of events that cause particular phenomena.
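The difference in representation can be made concrete with a small sketch. Instead of a list of samples at fixed intervals, the data is a list of timestamped events with parameters, and a simple query asks how often one kind of event is followed by another within a time window. The `Event` class, the `followed_within` function and the event names are hypothetical; the cited systems use their own, more sophisticated, formulations.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """A timestamped event with optional parameters (hypothetical layout)."""
    time: float
    kind: str
    params: dict = field(default_factory=dict)


def followed_within(events, first, second, window):
    """Count `first` events followed by a `second` event within `window`."""
    ordered = sorted(events, key=lambda e: e.time)
    count = 0
    for i, event in enumerate(ordered):
        if event.kind != first:
            continue
        for later in ordered[i + 1:]:
            if later.time - event.time > window:
                break  # later events are even further away
            if later.kind == second:
                count += 1
                break  # count each `first` event at most once
    return count


# Made-up log for illustration only.
log = [
    Event(0.0, "link_down", {"node": "a"}),
    Event(2.0, "alarm"),
    Event(10.0, "link_down", {"node": "b"}),
    Event(30.0, "alarm"),
]
close_pairs = followed_within(log, "link_down", "alarm", window=5.0)
```

In this made-up log only the first `link_down` is followed by an `alarm` within 5 time units, so `close_pairs` is 1.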

Several temporal expert systems have also been developed. Shahar [Shahar and Musen, 1995] suggests an expert system architecture for knowledge-based temporal abstraction in medical domains. This is extended to a framework for temporal abstraction in his later work [Shahar, 1997]. This framework is more extensive than the one presented here, but has a different focus: building expert systems. It seems overly complicated for learning purposes. Paliouras [Paliouras, 1997] shows how to automatically refine parameters in an existing temporal expert system, and applies this technique to the analysis of whale songs.

Recently, machine learning approaches have shown some promise. Manganaris [Manganaris, 1997] developed a system for supervised classification of univariate (not multivariate, as in this work) signals using piecewise polynomial modelling, and applied it to space shuttle data as well as artificial datasets. Keogh and Pazzani [Keogh and Pazzani, 1998] developed a technique for agglomerative clustering of univariate time series based on enhancing the time series with a line segment representation. By using delay portraits (looking at the relationship between a channel at time t and at time t-n), Rosenstein and Cohen [Rosenstein and Cohen, 1998] improve the reliability of robot sensor readings. Das et al. [Das et al., 1998] extract simple rules from univariate datasets.
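A delay portrait, as described in the parenthetical above, can be built by pairing each sample of a channel with the sample n steps earlier. The sketch below shows only this construction; the function name is hypothetical, and it does not reproduce how Rosenstein and Cohen go on to use the portraits.

```python
def delay_portrait(channel, n):
    """Pair each value at time t with the value at time t - n."""
    if n <= 0:
        raise ValueError("delay n must be a positive integer")
    # The first n samples have no earlier partner, so t starts at n.
    return [(channel[t], channel[t - n]) for t in range(n, len(channel))]


# A short made-up channel with a delay of 2 samples.
portrait = delay_portrait([1, 2, 3, 4, 5], 2)
# portrait == [(3, 1), (4, 2), (5, 3)]
```

Plotting the resulting pairs as points in the plane gives the portrait; a channel shorter than the delay yields an empty list.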



Mohammed Waleed Kadous
Wed May 19 20:21:38 EST 1999