de la Rua Martinez, J., Buso, F., Kouzoupis, A., Ormenisan, A. A., Niazi, S., Bzhalava, D., . . . Dowling, J. (2024). The Hopsworks Feature Store for Machine Learning. In: SIGMOD-Companion 2024 - Companion of the 2024 International Conference on Management of Data. Paper presented at 2024 International Conference on Management of Data, SIGMOD 2024, Santiago, Chile, Jun 9 2024 - Jun 15 2024 (pp. 135-147). Association for Computing Machinery (ACM)
The Hopsworks Feature Store for Machine Learning
2024 (English). In: SIGMOD-Companion 2024 - Companion of the 2024 International Conference on Management of Data, Association for Computing Machinery (ACM), 2024, p. 135-147. Conference paper, Published paper (Refereed)
Abstract [en]

Data management is the most challenging aspect of building Machine Learning (ML) systems. ML systems can read large volumes of historical data when training models, but inference workloads are more varied, depending on whether the system is a batch or online ML system. The feature store for ML has recently emerged as a single data platform for managing ML data throughout the ML lifecycle, from feature engineering to model training to inference. In this paper, we present the Hopsworks feature store for machine learning as a highly available platform for managing feature data with API support for columnar, row-oriented, and similarity search query workloads. We introduce and address the challenges that feature stores solve: feature reuse, how to organize data transformations, and how to ensure correct and consistent data between feature engineering, model training, and model inference. We present the engineering challenges in building high-performance query services for a feature store and show how Hopsworks outperforms existing cloud feature stores for training and online inference query workloads.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2024
Series
Proceedings of the ACM SIGMOD International Conference on Management of Data, ISSN 0730-8078
Keywords
arrow flight, duckdb, feature store, mlops, rondb
National Category
Computer Sciences; Computer Systems
Identifiers
urn:nbn:se:kth:diva-348769 (URN)
10.1145/3626246.3653389 (DOI)
001267334100014 ()
2-s2.0-85196429961 (Scopus ID)
Conference
2024 International Conference on Management of Data, SIGMOD 2024, Santiago, Chile, Jun 9 2024 - Jun 15 2024
Note

QC 20240628

Part of ISBN 979-840070422-2

Available from: 2024-06-27. Created: 2024-06-27. Last updated: 2025-12-05. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0002-4486-5343
