Scalable Unsupervised Feature Selection with Reconstruction Error Guarantees via QMR Decomposition
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0002-8044-4773
SEB Group, Stockholm, Sweden.
KTH, School of Electrical Engineering and Computer Science (EECS), Intelligent systems, Robotics, Perception and Learning, RPL. ORCID iD: 0000-0003-2965-2953
2024 (English). In: CIKM 2024 - Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, Association for Computing Machinery (ACM), 2024, pp. 3658-3662. Conference paper, Published paper (Refereed).
Abstract [en]

Unsupervised feature selection (UFS) methods have attracted significant attention for their ability to eliminate redundant features without relying on class label information. However, their scalability to large datasets remains a challenge, rendering common UFS methods impractical for such applications. To address this issue, we introduce QMR-FS, a greedy forward filtering approach that selects linearly independent features up to a specified relative tolerance, ensuring that any excluded feature can be reconstructed from the retained set within this tolerance. This is achieved through the QMR matrix decomposition, which builds upon the well-known QR decomposition. QMR-FS has linear complexity in the number of instances and runs fast in practice because its computation parallelizes well on both CPUs and GPUs. Despite its greedy nature, QMR-FS matches the classification and clustering accuracies of other UFS methods across multiple datasets, while running approximately 10 times faster than recently proposed scalable UFS methods on datasets ranging from 100 million to 1 billion elements.
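
The record carries no code, so the following is only a minimal sketch of the selection rule the abstract describes: greedy forward selection of linearly independent columns via modified Gram-Schmidt with a relative tolerance. The function name greedy_independent_columns and the parameter rtol are hypothetical; plain Gram-Schmidt here stands in for the paper's QMR decomposition (which the abstract says builds on QR, and which this sketch does not reproduce).

import numpy as np

def greedy_independent_columns(X, rtol=1e-2):
    # Greedily keep columns that are linearly independent of the
    # already-kept set up to a relative tolerance rtol, using
    # modified Gram-Schmidt. Any rejected column lies within
    # rtol (relative to its own norm) of the span of the kept columns.
    n, d = X.shape
    basis = []      # orthonormal basis spanning the kept columns
    selected = []   # indices of kept columns
    for j in range(d):
        v = X[:, j].astype(float)
        norm0 = np.linalg.norm(v)
        if norm0 == 0.0:
            continue  # an all-zero feature carries no information
        for q in basis:
            v -= (q @ v) * q   # remove the component along q
        res = np.linalg.norm(v)
        if res > rtol * norm0:
            basis.append(v / res)
            selected.append(j)
    return selected

# Toy usage: the third feature is a linear combination of the
# first two, so it is dropped.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 2))
X = np.column_stack([A, A @ np.array([0.5, -1.0])])
print(greedy_independent_columns(X))  # [0, 1]

Under this rule, every rejected column can be reconstructed from the kept columns to within rtol of its own norm, which is the reconstruction-error guarantee the abstract refers to.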

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2024, pp. 3658-3662
Keywords [en]
feature selection, linear independence, scalability, unsupervised learning
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-357143
DOI: 10.1145/3627673.3679994
Scopus ID: 2-s2.0-85210013171
OAI: oai:DiVA.org:kth-357143
DiVA, id: diva2:1918220
Conference
33rd ACM International Conference on Information and Knowledge Management, CIKM 2024, Boise, United States of America, October 21-25, 2024
Note

Part of ISBN 9798400704369

QC 20241205

Available from: 2024-12-04. Created: 2024-12-04. Last updated: 2024-12-05. Bibliographically approved.

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text
Scopus

Authority records

Ceylan, Ciwan; Kragic, Danica
