Reducing high-dimensional data by principal component analysis vs. random projection for nearest neighbor classification
2006 (English). In: Publications of the Finnish Artificial Intelligence Society, 2006, 23-30 p. Conference paper (Refereed)
The computational cost of nearest neighbor classification often prevents the method from being applied in practice to high-dimensional data, such as images and microarrays. One possible solution to this problem is to reduce the dimensionality of the data, ideally without losing predictive performance. Two dimensionality reduction methods, principal component analysis (PCA) and random projection (RP), are compared with respect to the performance of the resulting nearest neighbor classifier on five image data sets and two microarray data sets. The experimental results show that PCA results in higher accuracy than RP for all the data sets used in this study. However, it is also observed that RP generally outperforms PCA for higher numbers of dimensions. This leads to the conclusion that PCA is more suitable in time-critical cases (i.e., when distance calculations involving only a few dimensions can be afforded), while RP can be more suitable when less severe dimensionality reduction is required. In 6 and 4 cases out of 7, respectively, PCA and RP even outperform the non-reduced feature set, hence resulting in nearest neighbor classification that is not only more efficient but also more effective.
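The comparison described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the synthetic two-class data, the Gaussian random matrix scaled by 1/sqrt(k), PCA computed by power iteration with deflation, 1-nearest-neighbor with Euclidean distance, and all function names. It uses only the Python standard library; in practice a numerical library would normally be used instead.

```python
import math
import random

def center(X, mean=None):
    """Subtract column means (computed from X unless a mean is supplied)."""
    if mean is None:
        mean = [sum(col) / len(X) for col in zip(*X)]
    return [[v - m for v, m in zip(row, mean)] for row in X], mean

def pca_components(X, k, iters=200, seed=0):
    """Top-k principal directions via power iteration with deflation."""
    rng = random.Random(seed)
    Xc, mean = center(X)
    d = len(Xc[0])
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in Xc) / (len(Xc) - 1) for j in range(d)]
           for i in range(d)]
    comps = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0:  # no variance left to extract
                break
            v = [x / norm for x in w]
        # Eigenvalue estimate (Rayleigh quotient), then deflate the covariance.
        lam = sum(v[i] * sum(cov[i][j] * v[j] for j in range(d)) for i in range(d))
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
        comps.append(v)
    return comps, mean

def random_components(d, k, seed=0):
    """k random Gaussian directions in d dimensions, scaled by 1/sqrt(k)."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) / math.sqrt(k) for _ in range(d)] for _ in range(k)]

def project(X, comps, mean=None):
    """Project rows of X onto the given directions (centering first if mean is given)."""
    if mean is not None:
        X, _ = center(X, mean)
    return [[sum(a * b for a, b in zip(row, c)) for c in comps] for row in X]

def one_nn_accuracy(train_X, train_y, test_X, test_y):
    """Accuracy of 1-nearest-neighbor classification with Euclidean distance."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    correct = 0
    for x, y in zip(test_X, test_y):
        nearest = min(range(len(train_X)), key=lambda i: sq_dist(x, train_X[i]))
        if train_y[nearest] == y:
            correct += 1
    return correct / len(test_X)

def make_data(n, d, rng, shift=4.0):
    """Synthetic data (assumption): two classes separated along the first two features."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0, 1) for _ in range(d)]
        row[0] += shift * label
        row[1] += shift * label
        X.append(row)
        y.append(label)
    return X, y

rng = random.Random(42)
d, k = 20, 2
train_X, train_y = make_data(100, d, rng)
test_X, test_y = make_data(50, d, rng)

pcs, mean = pca_components(train_X, k)
pca_acc = one_nn_accuracy(project(train_X, pcs, mean), train_y,
                          project(test_X, pcs, mean), test_y)

rps = random_components(d, k)
rp_acc = one_nn_accuracy(project(train_X, rps), train_y,
                         project(test_X, rps), test_y)

full_acc = one_nn_accuracy(train_X, train_y, test_X, test_y)
print(f"full: {full_acc:.2f}  PCA(k={k}): {pca_acc:.2f}  RP(k={k}): {rp_acc:.2f}")
```

Varying `k` in this sketch gives a rough feel for the trade-off the abstract discusses, though results on synthetic data say nothing about the image and microarray data sets used in the study.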
Place, publisher, year, edition, pages
2006. 23-30 p.
Publications of the Finnish Artificial Intelligence Society, ISSN 1796-623X
Keywords
Computational costs, Data sets, Dimensionality reduction, Dimensionality reduction method, Distance calculation, Feature sets, High dimensional data, Image datasets, Nearest neighbor classification, Nearest Neighbor classifier, Possible solutions, Predictive performance, Random projections, Artificial intelligence, Principal component analysis
Identifiers
URN: urn:nbn:se:kth:diva-155009
ScopusID: 2-s2.0-84862522220
ISBN: 9525677001
ISBN: 9789525677003
OAI: oai:DiVA.org:kth-155009
DiVA: diva2:761005
9th Scandinavian Conference on Artificial Intelligence, SCAI 2006, 25 October 2006 through 27 October 2006, Espoo
QC 20141105
2014-11-05, 2014-10-29, 2014-11-05
Bibliographically approved