Crayfish: Navigating the Labyrinth of Machine Learning Inference in Stream Processing Systems
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. ORCID iD: 0000-0002-8573-0090
Brown University.
Boston University.
KTH, School of Electrical Engineering and Computer Science (EECS), Computer Science, Software and Computer systems, SCS. ORCID iD: 0000-0002-9351-8508
2024 (English). In: Advances in Database Technology - EDBT, Open Proceedings.org, 2024, Vol. 27, p. 676-689, article id 3. Conference paper, Published paper (Refereed)
Abstract [en]

As Machine Learning predictions are increasingly being used in business analytics pipelines, integrating stream processing with model serving has become a common data engineering task. Despite their synergies, separate software stacks typically handle streaming analytics and model serving. Systems for data stream management do not support ML inference out-of-the-box, while model-serving frameworks have limited functionality for continuous data transformations, windowing, and other streaming tasks. As a result, developers are left with a design space dilemma whose trade-offs are not well understood. This paper presents Crayfish, an extensible benchmarking framework that facilitates designing and executing comprehensive evaluation studies of streaming inference pipelines. We demonstrate the capabilities of Crayfish by studying four data processing systems, three embedded libraries, three external serving frameworks, and two pre-trained models. Our results prove the necessity of a standardized benchmarking framework and show that (1) even for serving tools in the same category, the performance can vary greatly and, sometimes, defy intuition, (2) GPU accelerators can show compelling improvements for the serving task, but the improvement varies across tools, and (3) serving alternatives can achieve significantly different performance, depending on the stream processors they are integrated with.

Place, publisher, year, edition, pages
Open Proceedings.org, 2024. Vol. 27, p. 676-689, article id 3
Series
Advances in Database Technology - EDBT, ISSN 2367-2005 ; 27
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:kth:diva-346149; DOI: 10.48786/edbt.2024.58; Scopus ID: 2-s2.0-85190993856; OAI: oai:DiVA.org:kth-346149; DiVA, id: diva2:1855934
Conference
27th International Conference on Extending Database Technology, EDBT 2024, Paestum, Italy, Mar 25 2024 - Mar 28 2024
Note

QC 20240507

Part of ISBN:

978-389318091-2, 978-389318094-3, 978-389318095-0

Available from: 2024-05-03. Created: 2024-05-03. Last updated: 2024-05-07. Bibliographically approved.

Authority records

Horchidan, Sonia-Florina; Carbone, Paris
