On Data Mining and Classification Using a Bayesian Confidence Propagation Neural Network
KTH, Superseded Departments, Numerical Analysis and Computer Science, NADA.
2003 (English). Doctoral thesis, comprehensive summary (Other scientific)
Abstract [en]

The aim of this thesis is to describe how a statistically based neural network technology, here named BCPNN (Bayesian Confidence Propagation Neural Network), which may be identified by rewriting Bayes' rule, can be used within a few applications: data mining and classification with credibility intervals, as well as unsupervised pattern recognition.

BCPNN is a neural network model somewhat reminiscent of Bayesian decision trees, which are often used within artificial intelligence systems. It has previously been successfully applied to classification tasks such as fault diagnosis, supervised pattern recognition and hierarchical clustering, and has also been used as a model for cortical memory. The learning paradigm used in BCPNN is rather different from that of many other neural network architectures. Learning in, e.g., the popular backpropagation (BP) network is a gradient method on an error surface, but learning in BCPNN is based upon calculations of marginal and joint probabilities between attributes. This is a quite time-efficient process compared to, for instance, gradient learning. The interpretation of the weight values in BCPNN is also easy compared to many other network architectures. The values of these weights and their uncertainty are also what we focus on in our data mining application. The most important results and findings in this thesis can be summarised in the following points:

    We demonstrate how BCPNN (Bayesian Confidence Propagation Neural Network) can be extended to model the uncertainties in collected statistics to produce outcomes as distributions from two different aspects: uncertainties induced by sparse sampling, which is useful for data mining; and uncertainties due to input data distributions, which is useful for process modelling.

    We indicate how classification with BCPNN gives higher certainty than an optimal Bayes classifier and better precision than a naïve Bayes classifier for limited data sets.

    We show how these techniques have been turned into a useful tool for real-world applications, within the drug safety area in particular.

    We present a simple but working method for automatic temporal segmentation of data sequences, and indicate some aspects of temporal tasks for which a Bayesian neural network may be useful.

    We present a method, based on recurrent BCPNN, which performs a task similar to that of an unsupervised clustering method on a large database with noisy, incomplete data, but much more quickly, and with an efficiency in finding patterns comparable to that of a well-known Bayesian clustering method (Autoclass) when we compare their performance on artificial data sets. Because BCPNN is a global method working on collective statistics, it can deal with really large data sets. We also get good indications that the outcome from BCPNN has higher clinical relevance than that of Autoclass in our application to the WHO database of adverse drug reactions, which makes it a relevant data mining tool for that database.

Artificial neural network, Bayesian neural network, data mining, adverse drug reaction signalling, classification, learning.
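
The abstract states that learning in BCPNN rests on marginal and joint probabilities between attributes, and that the resulting weights are easy to interpret. The following is a minimal sketch of that idea, not the thesis's implementation: it assumes the commonly cited form in which a bias is log P(y) and a weight is log(P(a, y) / (P(a) P(y))), both estimated from counts. The function names `train` and `classify`, the set-based encoding of attributes, the pseudo-count `alpha` and the toy data are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def train(samples, labels, alpha=1.0):
    """Estimate biases and weights from marginal and joint counts.

    samples: list of sets of active attributes (hypothetical encoding)
    labels:  one class label per sample
    alpha:   total pseudo-count, spread as if attributes and classes were
             independent (an assumption; it keeps every logarithm finite
             and yields a weight of 0 when there is no evidence)
    """
    n = len(samples)
    classes = sorted(set(labels))
    attributes = sorted(set().union(*samples))
    class_count = Counter(labels)
    attr_count = Counter(a for s in samples for a in s)
    joint_count = Counter((a, y) for s, y in zip(samples, labels) for a in s)

    k = len(classes)
    total = n + alpha
    bias, weight = {}, {}
    for y in classes:
        p_y = (class_count[y] + alpha / k) / total
        bias[y] = math.log(p_y)
        for a in attributes:
            p_a = (attr_count[a] + alpha / 2) / total
            p_ay = (joint_count[(a, y)] + alpha / (2 * k)) / total
            # Positive when a and y co-occur more often than independence predicts.
            weight[(a, y)] = math.log(p_ay / (p_a * p_y))
    return bias, weight


def classify(active, bias, weight):
    """Return the class with the largest summed log-support."""
    def support(y):
        # Attributes never seen in training contribute nothing.
        return bias[y] + sum(weight.get((a, y), 0.0) for a in active)
    return max(bias, key=support)


# Toy data, invented purely for illustration.
samples = [{"fever", "cough"}, {"fever"}, {"rash"}, {"rash", "itch"}]
labels = ["flu", "flu", "allergy", "allergy"]
bias, weight = train(samples, labels)
print(classify({"fever"}, bias, weight))
```

Training here is a single counting pass with no gradient steps, which is in line with the time efficiency the abstract describes. Each weight can be read directly: a positive value means the attribute and the class co-occur more often than they would if independent, a negative value means less often.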

Place, publisher, year, edition, pages
Stockholm: Numerisk analys och datalogi, 2003. viii, 83 p.
Trita-NA, ISSN 0348-2952 ; 0308
Keyword [en]
data mining, bcpnn, classification, neural network
URN: urn:nbn:se:kth:diva-3592
ISBN: 91-7283-508-7
OAI: diva2:9414
Public defence
NR 20140805
Available from: 2003-09-16. Created: 2003-09-16. Bibliographically approved.
