Integrating Logit Space Embeddings for Reliable Out-of-Distribution Detection
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer Systems, SCS. Qamcom Research and Technology AB, Kistagången 12, 164 40 Kista, Sweden. ORCID iD: 0000-0003-4984-029X
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer Systems, SCS. RISE, Kistagången 16, 164 40 Kista, Sweden. ORCID iD: 0000-0003-4516-7317
2025 (English). In: Machine Learning, Optimization, and Data Science - 10th International Conference, LOD 2024, Revised Selected Papers, Springer Nature, 2025, pp. 255-269. Conference paper, Published paper (Refereed)
Abstract [en]

Deep learning (DL) models have significantly transformed machine learning (ML), particularly with their prowess in classification tasks. However, these models struggle to differentiate between in-distribution (ID) and out-of-distribution (OOD) data at the testing phase. This challenge has curtailed their deployment in sensitive fields like biotechnology, where misidentifying OOD data, such as unclear or unknown bacterial genomic sequences, as known ID classes could lead to dire consequences. To address this, we propose an approach to make DL models OOD-sensitive by integrating the configuration of the logit space embeddings into the model’s decision-making process. Leveraging the effect observed in recent studies that there is minimal overlap between the embeddings of ID and OOD data, we use a density estimator to model the ID logit distribution based on the training data. This allows us to reliably flag data that do not match the ID distribution as OOD. Our methodology is designed to be independent of the specific data or model architecture and can seamlessly augment existing trained models without the need to expose them to OOD data. Testing our method on widely recognized image datasets, we achieve leading-edge results, including a substantial 10% enhancement in the area under the receiver operating characteristic curve (AUCROC) on the Google genome dataset.
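
The idea in the abstract (fit a density estimator to the logits of ID training data, then flag low-density inputs as OOD) can be illustrated with a minimal sketch. The record does not say which density estimator the authors use, so everything below is an assumption made for illustration: a diagonal Gaussian stands in for the estimator, the logits are synthetic rather than produced by a trained model, and the names `fit_diag_gaussian`, `log_density`, `fit_threshold`, and `is_ood` are hypothetical.

```python
import math
import random


def fit_diag_gaussian(logits):
    """Fit a per-dimension mean and variance to ID training logits."""
    n, d = len(logits), len(logits[0])
    mean = [sum(row[j] for row in logits) / n for j in range(d)]
    # A small constant keeps the variance strictly positive.
    var = [sum((row[j] - mean[j]) ** 2 for row in logits) / n + 1e-6
           for j in range(d)]
    return mean, var


def log_density(x, mean, var):
    """Log-density of one logit vector under the diagonal Gaussian."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def fit_threshold(logits, mean, var, quantile=0.05):
    """Pick the log-density below which `quantile` of the ID training data falls."""
    scores = sorted(log_density(x, mean, var) for x in logits)
    return scores[int(quantile * len(scores))]


def is_ood(x, mean, var, threshold):
    """Flag an input as OOD when its logit vector lies in a low-density region."""
    return log_density(x, mean, var) < threshold


# Synthetic stand-in for the logits of a trained 3-class model on ID data
# (assumption: a single confident class, so one Gaussian mode is enough).
rng = random.Random(0)
train_logits = [[rng.gauss(5.0, 1.0), rng.gauss(-2.0, 1.0), rng.gauss(-2.0, 1.0)]
                for _ in range(500)]

mean, var = fit_diag_gaussian(train_logits)
threshold = fit_threshold(train_logits, mean, var)

id_like = [5.0, -2.0, -2.0]   # close to the training logits
ood_like = [0.0, 0.0, 0.0]    # flat logits, far from the training logits

print(is_ood(id_like, mean, var, threshold))
print(is_ood(ood_like, mean, var, threshold))
```

Only ID training data is needed to fit the estimator and the threshold, which matches the abstract's claim that the model never has to be exposed to OOD data. With several classes, ID logits typically form one cluster per class, so a mixture model or kernel density estimate would be a more plausible choice than the single Gaussian used here.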

Place, publisher, year, edition, pages
Springer Nature, 2025. pp. 255-269
National subject category
Computer Sciences; Bioinformatics (Computational Biology); Software Engineering
Identifiers
URN: urn:nbn:se:kth:diva-361973; DOI: 10.1007/978-3-031-82484-5_19; ISI: 001530956900019; Scopus ID: 2-s2.0-105000982628; OAI: oai:DiVA.org:kth-361973; DiVA, id: diva2:1949646
Conference
10th International Conference on Machine Learning, Optimization, and Data Science, LOD 2024, Castiglione della Pescaia, Italy, September 22-25, 2024
Note

Part of ISBN 9783031824838

QC 20250404

Available from: 2025-04-03. Created: 2025-04-03. Last updated: 2025-12-08. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Person

Komini, Vangjush; Girdzijauskas, Sarunas
