Data Mining and Process Modelling using a Bayesian Confidence Propagation Neural Network

Abstract

The aim of this thesis is to describe how a statistically based neural network technology, the BCPNN (Bayesian Confidence Propagation Neural Network), can be used in two different applications: data mining in a huge database and modelling of an industrial process. BCPNN has previously been used successfully in classification tasks such as fault diagnosis, pattern recognition and hierarchical clustering analysis.

BCPNN is a neural network model somewhat reminiscent of the Bayesian decision trees used in artificial intelligence systems. As a neural network, the BCPNN is rather different from backpropagation (BP) and other gradient methods. Learning in BCPNN is based on calculating probabilities and dependencies, which is often a more or less straightforward process compared with the usually time-consuming iterative gradient methods. The weight values in a BCPNN are also rather easy to interpret compared with the weight values in a network trained by gradient methods.
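To make the contrast with gradient training concrete, the following is a minimal sketch of a one-layer Bayesian network of this kind, in which biases and weights are computed directly from counts rather than fitted iteratively. It assumes the commonly cited form in which the bias is log P(y) and the weight is log(P(x, y) / (P(x) P(y))). The function names `train` and `classify`, the smoothing constant `eps` and the toy data are illustrative assumptions, not taken from the thesis.

```python
import math
from collections import Counter


def train(samples, eps=0.5):
    """Estimate biases and weights from counts in a single pass.

    samples: list of (set of active input features, class label).
    """
    n = len(samples)
    feat, cls, joint = Counter(), Counter(), Counter()
    for features, label in samples:
        cls[label] += 1
        for f in features:
            feat[f] += 1
            joint[(f, label)] += 1

    def prob(count):
        # Assumption: crude smoothing so that zero counts do not give log(0).
        return (count + eps) / (n + eps)

    # Bias: log prior probability of each class.
    bias = {c: math.log(prob(cls[c])) for c in cls}
    # Weight: log of the ratio between the joint probability and the
    # product of the marginals. Positive means the feature supports the
    # class, negative means it speaks against it, near zero means the
    # two are close to independent.
    weights = {
        (f, c): math.log(prob(joint[(f, c)]) / (prob(feat[f]) * prob(cls[c])))
        for f in feat
        for c in cls
    }
    return bias, weights


def classify(bias, weights, features):
    """Sum bias and weights of active features, then normalise."""
    support = {
        c: b + sum(weights.get((f, c), 0.0) for f in features)
        for c, b in bias.items()
    }
    top = max(support.values())
    expo = {c: math.exp(s - top) for c, s in support.items()}
    total = sum(expo.values())
    return {c: e / total for c, e in expo.items()}


# Toy data (hypothetical): feature "a" mostly occurs with "pos", "b" with "neg".
samples = [({"a"}, "pos")] * 3 + [({"b"}, "neg")] * 3 + [({"a", "b"}, "pos")]
bias, weights = train(samples)
posterior = classify(bias, weights, {"a"})
```

Because each weight is a log ratio of estimated probabilities, its sign and size can be read directly as evidence for or against a class, which is the kind of interpretability the paragraph above refers to.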

By process modelling we here mean function approximation. A function in the general sense may be considered a spatio-temporal outcome of a spatio-temporal input. Function approximation in this general sense is somewhat more complex than the modelling done in this thesis, as we do not deal with time in the papers that discuss process modelling. To give a glimpse of how the BCPNN can also deal with time, two papers are included that address some temporal aspects of BCPNN.

The most important results of this thesis can be summarized as follows:
We show how a Bayesian neural network can be extended to model the uncertainties in the collected statistics, so that outcomes are produced as distributions. Two sources of uncertainty are covered: uncertainty induced by sampling, which is useful for data mining, and uncertainty due to input data distributions, which is useful for process modelling.
We show how complex dependencies can be found within large data sets while still avoiding combinatorial explosion.
We show how these techniques have been turned into a useful tool for real-world applications, in particular within the drug safety area.
We compare some results of the BCPNN technique with the well-established non-linear regression technique BP (backpropagation networks) for process modelling, showing that the BCPNN performs at least equally well while providing extra information about the uncertainties of the produced outcomes.
We present a simple but working method for automatic temporal segmentation of data sequences.
We indicate some aspects of temporal tasks for which a predictive Bayesian neural network may be useful, and show how the connection matrix can be reduced due to regularities in the data.
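
The first result in the list above, treating collected statistics as uncertain because of sampling, can be illustrated with a generic Bayesian sketch: a Beta posterior over a single probability estimated from counts. This is not the thesis's own formulation. The uniform prior, the function name `beta_posterior` and the example counts are assumptions chosen for illustration.

```python
def beta_posterior(count, n, prior_a=1.0, prior_b=1.0):
    """Posterior mean and variance of a probability given count hits in n trials.

    Assumption: a Beta(prior_a, prior_b) prior, uniform by default.
    """
    a = prior_a + count
    b = prior_b + (n - count)
    mean = a / (a + b)
    # Variance of Beta(a, b), written in terms of the mean.
    variance = mean * (1.0 - mean) / (a + b + 1.0)
    return mean, variance


# Same observed proportion (30 %), very different amounts of data.
small_mean, small_var = beta_posterior(3, 10)
large_mean, large_var = beta_posterior(300, 1000)
```

Both estimates have a mean near 0.3, but the estimate built on 1000 samples has a much smaller variance. Carrying such a distribution through the network, instead of a point estimate, is one way to obtain outcomes with an attached uncertainty.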


Author:
Roland Orre
Last modified: Mon Feb 17 03:38:06 CET 2003