Information theoretic learning for pattern classification
Permanent link
https://hdl.handle.net/10037/1773
Date
2007-12-17
Type
Master thesis
Author
Kvisle Storås, Ola
Abstract
This thesis is a study of pattern classification based on information theoretic criteria.
Information theoretic criteria are important measures based on entropy and divergence between data distributions.
First, the basic concepts of pattern classification are discussed, with the well-known Bayes classification rule
as a starting point.
We discuss how the Parzen window estimator may be used to find good density estimates.
The Parzen window density estimator can be used to estimate
cost functions based on information theoretic criteria.
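
As an illustration of the Parzen window estimator mentioned above, here is a minimal Python sketch. It assumes an isotropic Gaussian window with a fixed width `sigma`; the function name, the kernel choice, and the fixed width are assumptions made for this example and are not taken from the thesis.

```python
import numpy as np

def parzen_density(x, samples, sigma):
    """Parzen window estimate of p(x) with an isotropic Gaussian kernel.

    x:       (m, d) array of query points
    samples: (n, d) array of observed data points
    sigma:   window width (assumed fixed and the same in every dimension)
    """
    x = np.asarray(x, dtype=float)
    samples = np.asarray(samples, dtype=float)
    d = samples.shape[1]
    # Squared Euclidean distance between every query point and every sample.
    sq_dist = ((x[:, None, :] - samples[None, :, :]) ** 2).sum(axis=2)
    # Normalising constant of a d-dimensional Gaussian density.
    norm = (2.0 * np.pi * sigma ** 2) ** (-d / 2.0)
    # The estimate is the average of the kernels centred on the samples.
    return norm * np.exp(-sq_dist / (2.0 * sigma ** 2)).mean(axis=1)
```

Both arguments are expected as two-dimensional arrays, so one-dimensional data should be passed with shape `(n, 1)`.
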
Furthermore, we explain a model of an information theoretic learning machine.
With cost functions based on information theoretic criteria, we argue that a learning machine potentially
captures much more information about a data set than the traditional mean squared error (MSE) cost function.
We find that there is a geometric link between information theoretic cost functions estimated using
Parzen windowing, and mean vectors in a Mercer kernel feature space.
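
One way to see this link is through the integrated squared error between two Gaussian Parzen estimates. Because the convolution of two Gaussians of width sigma is a Gaussian of width sqrt(2)*sigma, the integral reduces to three averages over kernel matrices, which are the inner products of the two class mean vectors in the feature space of that kernel. The sketch below assumes Gaussian windows of equal width for both distributions; the function names are hypothetical, and the exact formulation in the thesis may differ.

```python
import numpy as np

def gaussian_gram(a, b, sigma):
    """Gaussian kernel matrix between rows of a (n, d) and b (m, d),
    normalised so that each kernel is a probability density."""
    d = a.shape[1]
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    norm = (2.0 * np.pi * sigma ** 2) ** (-d / 2.0)
    return norm * np.exp(-sq_dist / (2.0 * sigma ** 2))

def ise_from_samples(xp, xq, sigma):
    """ISE between the Parzen estimates of p and q, both built with width sigma.

    Equals <m_p, m_p> - 2 <m_p, m_q> + <m_q, m_q> = ||m_p - m_q||^2, where
    m_p and m_q are the sample means in the feature space of the Gaussian
    kernel with width sqrt(2) * sigma.
    """
    s = np.sqrt(2.0) * sigma
    return (gaussian_gram(xp, xp, s).mean()
            - 2.0 * gaussian_gram(xp, xq, s).mean()
            + gaussian_gram(xq, xq, s).mean())
```

No explicit feature map is needed here: only kernel evaluations between data points are used, which is what operating implicitly in the feature space means in this sketch.
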
This link is used to propose and implement different classifiers based on the integrated squared error (ISE)
divergence measure, operating implicitly in a Mercer kernel feature space. We also apply spectral methods to implement
the same ISE classifiers working in approximations of Mercer kernel feature spaces.
We investigate the performance of the classifiers when we weight each data point with the
inverse of the probability density function at that point.
We find that the ISE classifiers working implicitly in the Mercer kernel feature space perform similarly
to a Parzen window based Bayes classifier. Using a weighted inner-product definition gives slightly better results for
some data sets, while for other data sets the classification rates are slightly worse.
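
For reference, a Parzen window based Bayes classifier of the kind used as the comparison above can be sketched as follows. This is a generic version that assigns each point to the class with the largest product of prior and Parzen density; the Gaussian window, the single shared width `sigma`, the frequency-based priors, and the function name are assumptions for this example, not details confirmed by the thesis.

```python
import numpy as np

def parzen_bayes_predict(x, train_x, train_y, sigma):
    """Assign each row of x (m, d) to the class maximising prior * density."""
    classes = np.unique(train_y)
    d = train_x.shape[1]
    norm = (2.0 * np.pi * sigma ** 2) ** (-d / 2.0)
    scores = []
    for c in classes:
        xc = train_x[train_y == c]
        # Parzen estimate of the class-conditional density at each query point.
        sq_dist = ((x[:, None, :] - xc[None, :, :]) ** 2).sum(axis=2)
        density = norm * np.exp(-sq_dist / (2.0 * sigma ** 2)).mean(axis=1)
        # Class prior estimated from the class frequencies (an assumption).
        prior = len(xc) / len(train_x)
        scores.append(prior * density)
    # Pick the class with the highest score for every query point.
    return classes[np.argmax(np.stack(scores, axis=1), axis=1)]
```
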
When comparing the results between the implicit ISE classifier using unweighted data points and the Parzen window
Bayes classifier, some of the results indicate that the ISE classifier favors the classes with the highest entropy.
Publisher
Universitetet i Tromsø (University of Tromsø)
Copyright 2007 The Author(s)