Reducing high-dimensional data by principal component analysis vs. random projection for nearest neighbor classification
Deegalla, Sampath (KTH, School of Information and Communication Technology (ICT), Computer and Systems Sciences, DSV)
Boström, Henrik (KTH, School of Information and Communication Technology (ICT), Computer and Systems Sciences, DSV)
2006 (English). In: Proc. Int. Conf. Mach. Learn. Appl. ICMLA, 2006, pp. 245-250. Conference paper, published paper (refereed).
Abstract [en]

The computational cost of nearest neighbor classification often prevents the method from being applied in practice to high-dimensional data, such as images and microarrays. One possible solution to this problem is to reduce the dimensionality of the data, ideally without losing predictive performance. Two dimensionality reduction methods, principal component analysis (PCA) and random projection (RP), are investigated for this purpose and compared with respect to the performance of the resulting nearest neighbor classifier on five image data sets and five microarray data sets. The experimental results demonstrate that PCA outperforms RP on all data sets used in this study. However, the experiments also show that PCA is more sensitive to the choice of the number of reduced dimensions: after reaching a peak, accuracy degrades with the number of dimensions for PCA, while accuracy for RP increases with the number of dimensions. The experiments also show that PCA and RP may even outperform using the non-reduced feature set (in 9 and 6 cases out of 10, respectively), resulting in nearest neighbor classification that is not only more efficient but also more effective.

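As a concrete illustration of the comparison the abstract describes, the following is a minimal sketch that reduces an image data set with PCA and with random projection before 1-nearest-neighbor classification. It assumes scikit-learn and uses its built-in digits data set purely as a stand-in; the paper's actual data sets, neighbor count, and grid of reduced dimensions are not reproduced here.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.random_projection import GaussianRandomProjection

# Hypothetical setup: 64-dimensional digit images stand in for the
# paper's image and microarray data sets.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

for n_dims in (5, 10, 20, 40):
    for name, reducer in (
        ("PCA", PCA(n_components=n_dims, random_state=0)),
        ("RP", GaussianRandomProjection(n_components=n_dims, random_state=0)),
    ):
        # Fit the reduction on the training data only, then apply the
        # same mapping to the test data before 1-NN classification.
        Z_train = reducer.fit_transform(X_train)
        Z_test = reducer.transform(X_test)
        knn = KNeighborsClassifier(n_neighbors=1).fit(Z_train, y_train)
        print(f"{name:>3} d={n_dims:>2} accuracy={knn.score(Z_test, y_test):.3f}")
```

On a sweep like this one would look for the pattern reported in the abstract: a peak followed by degradation for PCA, and accuracy that keeps rising with the number of dimensions for RP. RP's appeal is computational: it multiplies the data by a random matrix and needs no eigendecomposition, which is what makes it cheaper than PCA on very high-dimensional inputs.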
Place, publisher, year, edition, pages
2006, pp. 245-250.
Series
Proceedings - 5th International Conference on Machine Learning and Applications, ICMLA 2006
Keywords [en]
Data reduction, Image analysis, Microarrays, Principal component analysis, Random processes, Computational cost, Random projection, Classification (of information)
National Category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-155360
DOI: 10.1109/ICMLA.2006.43
ISI: 000244477800038
Scopus ID: 2-s2.0-38449115187
ISBN: 0769527353 (print)
ISBN: 9780769527352 (print)
OAI: oai:DiVA.org:kth-155360
DiVA: diva2:762667
Conference
5th International Conference on Machine Learning and Applications, ICMLA 2006, 14-16 December 2006, Orlando, FL, USA
Note

QC 20141112

Available from: 2014-11-12. Created: 2014-11-05. Last updated: 2016-12-08. Bibliographically approved.

Open Access in DiVA

No full text
