The Hopsworks Feature Store for Machine Learning
KTH, School of Electrical Engineering and Computer Science (EECS). Hopsworks AB, Stockholm, Sweden.
Hopsworks AB, Stockholm, Sweden.
Hopsworks AB, Stockholm, Sweden.
Hopsworks AB, Stockholm, Sweden.
2024 (English) In: SIGMOD-Companion 2024 - Companion of the 2024 International Conference on Management of Data, Association for Computing Machinery (ACM), 2024, p. 135-147. Conference paper, Published paper (Refereed)
Abstract [en]

Data management is the most challenging aspect of building Machine Learning (ML) systems. ML systems can read large volumes of historical data when training models, but inference workloads are more varied, depending on whether it is a batch or online ML system. The feature store for ML has recently emerged as a single data platform for managing ML data throughout the ML lifecycle, from feature engineering to model training to inference. In this paper, we present the Hopsworks feature store for machine learning as a highly available platform for managing feature data with API support for columnar, row-oriented, and similarity search query workloads. We introduce and address challenges solved by the feature stores related to feature reuse, how to organize data transformations, and how to ensure correct and consistent data between feature engineering, model training, and model inference. We present the engineering challenges in building high-performance query services for a feature store and show how Hopsworks outperforms existing cloud feature stores for training and online inference query workloads.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2024. p. 135-147
Series
Proceedings of the ACM SIGMOD International Conference on Management of Data, ISSN 0730-8078
Keywords [en]
arrow flight, duckdb, feature store, mlops, rondb
National Category
Computer Sciences; Computer Systems
Identifiers
URN: urn:nbn:se:kth:diva-348769; DOI: 10.1145/3626246.3653389; ISI: 001267334100014; Scopus ID: 2-s2.0-85196429961; OAI: oai:DiVA.org:kth-348769; DiVA, id: diva2:1878679
Conference
2024 International Conference on Management of Data, SIGMOD 2024, Santiago, Chile, Jun 9 2024 - Jun 15 2024
Note

QC 20240628

Part of ISBN 979-840070422-2

Available from: 2024-06-27 Created: 2024-06-27 Last updated: 2025-12-05. Bibliographically approved

Open Access in DiVA

No full text in DiVA

Other links

Publisher's full text; Scopus

Person

de la Rua Martinez, Javier; Khazanchi, Ayushman; Vlassov, Vladimir; Dowling, Jim

Search in DiVA

By author/editor
de la Rua Martinez, Javier; Khazanchi, Ayushman; Vlassov, Vladimir; Dowling, Jim
By organisation
School of Electrical Engineering and Computer Science (EECS); Software and Computer Systems, SCS
Computer Sciences; Computer Systems
