On Concentration of Discrete Distributions with Applications to Supervised Learning of Classifiers
Department of Mathematics, Linköpings University. ORCID iD: 0000-0003-1489-8512
Department of Mathematics, Linköpings University.
2007 (English)
In: Machine Learning and Data Mining in Pattern Recognition, Proceedings / [ed] Perner, Petra, Berlin: Springer Berlin/Heidelberg, 2007, 2-16 p.
Conference paper (Refereed)
Abstract [en]

Computational procedures using independence assumptions in various forms are popular in machine learning, although checks on empirical data have given inconclusive results about their impact. Some theoretical understanding of when they work is available, but a definite answer seems to be lacking. This paper derives distributions that maximize the statewise difference to the respective product of marginals. These distributions are, in a sense, the worst distributions for predicting an outcome of the data-generating mechanism by independence. We also restrict the scope of new theoretical results by showing explicitly that, depending on context, independent ('Naïve') classifiers can be as bad as tossing coins. Regardless of this, independence may beat the generating model in learning supervised classification, and we explicitly provide one such scenario.
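
As an illustration of the quantity named in the abstract, the following minimal sketch computes the largest statewise gap between a two-variable joint distribution and the product of its marginals. It is not taken from the paper: the function name `max_statewise_difference`, the restriction to two variables, and the example tables are assumptions made here for illustration only.

```python
def max_statewise_difference(joint):
    """Largest |P(x, y) - P(x) * P(y)| over all states (x, y).

    `joint` is a list of rows holding a two-variable joint
    probability table (hypothetical input format for this sketch).
    """
    row_marginal = [sum(row) for row in joint]        # P(x)
    col_marginal = [sum(col) for col in zip(*joint)]  # P(y)
    return max(
        abs(p - row_marginal[i] * col_marginal[j])
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
    )


# Two binary variables that always agree: far from independent.
dependent = [[0.5, 0.0],
             [0.0, 0.5]]

# Two independent fair coins: joint equals the product of marginals.
independent = [[0.25, 0.25],
               [0.25, 0.25]]

print(max_statewise_difference(dependent))
print(max_statewise_difference(independent))
```

In the first table every state differs from the product of marginals by the same amount, while in the second the gap vanishes, which is the sense in which an independence assumption can mispredict individual outcomes.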

Place, publisher, year, edition, pages
Berlin: Springer Berlin/Heidelberg, 2007. 2-16 p.
Series: Lecture Notes in Computer Science, ISSN 0302-9743 ; 4571
Keyword [en]
independence, classification, supervised learning, pattern recognition, prediction
National Category
Other Computer and Information Science
URN: urn:nbn:se:kth:diva-91163
ISI: 000248523200001
ISBN: 978-3-540-73498-7
OAI: diva2:508375
5th International Conference on Machine Learning and Data Mining in Pattern Recognition. Leipzig, Germany. July 18-20, 2007
QC 20120313
Available from: 2012-03-08
Created: 2012-03-08
Last updated: 2012-03-13
Bibliographically approved

Open Access in DiVA

No full text

By author/editor
Koski, Timo