Reducing high-dimensional data by principal component analysis vs. random projection for nearest neighbor classification
KTH, School of Information and Communication Technology (ICT), Computer and Systems Sciences, DSV.
KTH, School of Information and Communication Technology (ICT), Computer and Systems Sciences, DSV.
2006 (English). In: Publications of the Finnish Artificial Intelligence Society, 2006, pp. 23-30. Conference paper, published paper (refereed).
Abstract [en]

The computational cost of nearest neighbor classification often prevents the method from being applied in practice to high-dimensional data, such as images and microarrays. One possible solution to this problem is to reduce the dimensionality of the data, ideally without losing predictive performance. Two dimensionality reduction methods, principal component analysis (PCA) and random projection (RP), are compared with respect to the performance of the resulting nearest neighbor classifier on five image data sets and two microarray data sets. The experimental results show that PCA yields higher accuracy than RP on all the data sets used in this study. However, it is also observed that RP generally outperforms PCA at higher numbers of dimensions. This leads to the conclusion that PCA is more suitable in time-critical cases (i.e., when distance calculations involving only a few dimensions can be afforded), while RP can be more suitable when less severe dimensionality reduction is required. In 6 and 4 out of 7 cases, respectively, using PCA or RP even outperforms using the non-reduced feature set, resulting in not only more efficient but also more effective nearest neighbor classification.
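The pipeline compared in the abstract (reduce dimensionality with PCA or RP, then classify with a nearest neighbor classifier) can be sketched as follows. This is not the authors' implementation; it is a minimal illustration using scikit-learn and its bundled digits image data set as stand-ins for the paper's data and code.

```python
# Illustrative sketch (not the paper's code): PCA vs. random projection
# as preprocessing for nearest neighbor classification.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.random_projection import GaussianRandomProjection
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# 64-dimensional image data as a stand-in for the paper's data sets.
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scores = {}
for name, reducer in [
    ("PCA", PCA(n_components=10, random_state=0)),
    ("RP", GaussianRandomProjection(n_components=10, random_state=0)),
]:
    # Fit the reducer on training data only, then apply it to both splits.
    Z_tr = reducer.fit_transform(X_tr)
    Z_te = reducer.transform(X_te)
    # 1-nearest-neighbor classification in the reduced space.
    clf = KNeighborsClassifier(n_neighbors=1).fit(Z_tr, y_tr)
    scores[name] = clf.score(Z_te, y_te)
    print(f"{name}: {scores[name]:.3f}")
```

The number of components (10 here) is the knob the paper varies: the abstract's finding is that PCA tends to win at few dimensions, while RP catches up or overtakes as more dimensions are kept.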

Place, publisher, year, edition, pages
2006. 23-30 p.
Series
Publications of the Finnish Artificial Intelligence Society, ISSN 1796-623X
Keyword [en]
Computational costs, Data sets, Dimensionality reduction, Dimensionality reduction method, Distance calculation, Feature sets, High dimensional data, Image datasets, Nearest neighbor classification, Nearest Neighbor classifier, Possible solutions, Predictive performance, Random projections, Artificial intelligence, Principal component analysis
National Category
Computer Science
Identifiers
URN: urn:nbn:se:kth:diva-155009
Scopus ID: 2-s2.0-84862522220
ISBN: 9525677001 (print)
ISBN: 9789525677003 (print)
OAI: oai:DiVA.org:kth-155009
DiVA: diva2:761005
Conference
9th Scandinavian Conference on Artificial Intelligence, SCAI 2006, 25 October 2006 through 27 October 2006, Espoo
Note

QC 20141105

Available from: 2014-11-05 Created: 2014-10-29 Last updated: 2016-12-08. Bibliographically approved.

Open Access in DiVA

No full text
Search in DiVA

By author/editor
Deegalla, Sampath; Boström, Henrik
By organisation
Computer and Systems Sciences, DSV
Computer Science
