Publications (10 of 26)
Segeljakt, K., Haridi, S. & Carbone, P. (2024). AquaLang: A Dataflow Programming Language. In: DEBS 2024 - Proceedings of the 18th ACM International Conference on Distributed and Event-Based Systems. Paper presented at 18th ACM International Conference on Distributed and Event-Based Systems, DEBS 2024, Villeurbanne, France, Jun 25 2024 - Jun 28 2024 (pp. 42-53). Association for Computing Machinery (ACM)
2024 (English). In: DEBS 2024 - Proceedings of the 18th ACM International Conference on Distributed and Event-Based Systems, Association for Computing Machinery (ACM), 2024, p. 42-53. Conference paper, Published paper (Refereed)
Abstract [en]

Dataflow systems are widely used today for building and running continuous data-intensive applications. However, the unavoidable semantic gap between the host languages of dataflow system libraries and the dataflow model creates programmability limitations that hinder performance, safety, and ease of use. We propose AquaLang, a new language designed for dataflow systems. Programs in AquaLang blend strongly typed relational and functional syntax and are verified using an effect system that prevents undefined behaviour that can occur when introducing user-defined logic that violates dataflow semantics. Unverified external code is also feasible in AquaLang through the novel use of sandboxing. Furthermore, on top of standard dataflow optimisations employed by current systems, AquaLang's ability to analyze algebraic properties of user-defined functions further unlocks the potential of deeper dataflow program re-writing. In our evaluation, we measure up to one order of magnitude speedup for Nexmark queries against hand-written Flink programs attributed to pushdown and window incrementalisation techniques.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2024
Keywords
Data Streams, Dataflow Systems, Programming Languages
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-351925 (URN); 10.1145/3629104.3666030 (DOI); 001283849100006 (); 2-s2.0-85200659561 (Scopus ID)
Conference
18th ACM International Conference on Distributed and Event-Based Systems, DEBS 2024, Villeurbanne, France, Jun 25 2024 - Jun 28 2024
Note

Part of ISBN 9798400704437

QC 20240827

Available from: 2024-08-19. Created: 2024-08-19. Last updated: 2024-09-10. Bibliographically approved
Lindén, J., Ermedahl, A., Salomonsson, H., Daneshtalab, M., Forsberg, B. & Carbone, P. (2024). Autonomous Realization of Safety- and Time-Critical Embedded Artificial Intelligence. In: 2024 Design, Automation and Test in Europe Conference and Exhibition, DATE 2024 - Proceedings. Paper presented at 2024 Design, Automation and Test in Europe Conference and Exhibition, DATE 2024, Valencia, Spain, Mar 25 2024 - Mar 27 2024. Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: 2024 Design, Automation and Test in Europe Conference and Exhibition, DATE 2024 - Proceedings, Institute of Electrical and Electronics Engineers (IEEE), 2024. Conference paper, Published paper (Refereed)
Abstract [en]

There is an evident need to complement embedded critical control logic with AI inference, but today's AI-capable hardware, software, and processes are primarily targeted towards the needs of cloud-centric actors. Telecom and defense airspace industries, which make heavy use of specialized hardware, face the challenge of manually hand-tuning AI workloads and hardware, presenting an unprecedented cost and complexity due to the diversity and sheer number of deployed instances. Furthermore, embedded AI functionality must not adversely affect real-time and safety requirements of the critical business logic. To address this, end-to-end AI pipelines for critical platforms are needed to automate the adaption of networks to fit into resource-constrained devices under critical and real-time constraints, while remaining interoperable with de-facto standard AI tools and frameworks used in the cloud. We present two industrial applications where such solutions are needed to bring AI to critical and resource-constrained hardware, and a generalized end-to-end AI pipeline that addresses these needs. Crucial steps to realize it are taken in the industry-academia collaborative FASTER-AI project.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
embedded systems, machine learning
National Category
Computer Systems Software Engineering
Identifiers
urn:nbn:se:kth:diva-350536 (URN); 10.23919/DATE58400.2024.10546824 (DOI); 001253778900307 (); 2-s2.0-85196520555 (Scopus ID)
Conference
2024 Design, Automation and Test in Europe Conference and Exhibition, DATE 2024, Valencia, Spain, Mar 25 2024 - Mar 27 2024
Note

Part of ISBN 978-3-9819263-8-5

QC 20241119

Available from: 2024-07-16. Created: 2024-07-16. Last updated: 2024-11-19. Bibliographically approved
Siachamis, G., Psarakis, K., Fragkoulis, M., Van Deursen, A., Carbone, P. & Katsifodimos, A. (2024). CheckMate: Evaluating Checkpointing Protocols for Streaming Dataflows. In: Proceedings - 2024 IEEE 40th International Conference on Data Engineering, ICDE 2024. Paper presented at 40th IEEE International Conference on Data Engineering, ICDE 2024, Utrecht, Netherlands, May 13 2024 - May 17 2024 (pp. 4030-4043). Institute of Electrical and Electronics Engineers (IEEE)
2024 (English). In: Proceedings - 2024 IEEE 40th International Conference on Data Engineering, ICDE 2024, Institute of Electrical and Electronics Engineers (IEEE), 2024, p. 4030-4043. Conference paper, Published paper (Refereed)
Abstract [en]

Stream processing in the last decade has seen broad adoption in both commercial and research settings. One key element for this success is the ability of modern stream processors to handle failures while ensuring exactly-once processing guarantees. At the time of writing, virtually all stream processors that guarantee exactly-once processing implement a variant of Apache Flink's coordinated checkpoints, an extension of the original Chandy-Lamport checkpoints from 1985. However, the reasons behind this prevalence of the coordinated approach remain anecdotal, as reported by practitioners of the stream processing community. At the same time, common checkpointing approaches, such as the uncoordinated and the communication-induced ones, remain largely unexplored. This paper is the first to address this gap by i) shedding light on why practitioners have favored the coordinated approach and ii) investigating whether there are viable alternatives. To this end, we implement three checkpointing approaches that we surveyed and adapted for the distinct needs of streaming dataflows. Our analysis shows that the coordinated approach outperforms the uncoordinated and communication-induced protocols under uniformly distributed workloads. To our surprise, however, the uncoordinated approach is not only competitive with the coordinated one in uniformly distributed workloads, but it also outperforms the coordinated approach in skewed workloads. We conclude that rather than blindly employing coordinated checkpointing, research should focus on optimizing the very promising uncoordinated approach, as it can address issues with skew and support prevalent cyclic queries. We believe that our findings can trigger further research into checkpointing mechanisms.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers (IEEE), 2024
Keywords
Benchmarking, Checkpointing, Experimental evaluation, Fault tolerance, Stream processing
National Category
Computer Sciences Software Engineering
Identifiers
urn:nbn:se:kth:diva-351951 (URN); 10.1109/ICDE60146.2024.00309 (DOI); 2-s2.0-85200460157 (Scopus ID)
Conference
40th IEEE International Conference on Data Engineering, ICDE 2024, Utrecht, Netherlands, May 13 2024 - May 17 2024
Note

 Part of ISBN 9798350317152

QC 20240827

Available from: 2024-08-19. Created: 2024-08-19. Last updated: 2024-08-27. Bibliographically approved
Hasselberg, A., Timoudas, T. O., Carbone, P. & Dán, G. (2024). Cliffhanger: An Experimental Evaluation of Stateful Serverless at the Edge. In: 2024 19th Wireless On-Demand Network Systems and Services Conference. Paper presented at 19th Wireless On-Demand Network Systems and Services Conference (WONS), Jan 29-31, 2024, Chamonix, France (pp. 41-48). IEEE
2024 (English). In: 2024 19th Wireless On-Demand Network Systems and Services Conference, IEEE, 2024, p. 41-48. Conference paper, Published paper (Refereed)
Abstract [en]

The serverless computing paradigm has transformed cloud service deployment by enabling automatic scaling of resources in response to varying demand. Building on this, stateful serverless computing introduces critical capabilities for data management, fault tolerance, and consistency, which are particularly relevant in the context of distributed deployments, notably in edge computing environments. In this work, we explore the feasibility of stateful serverless computing in resource-limited edge environments through an empirical study utilizing a multi-view object tracking application. Our results show that while these systems perform well in cloud environments, their effectiveness is severely affected at the edge due to state, application, and resource management solutions optimized for cloud environments. Existing solutions are most detrimental to applications with intermittent workloads, as typical combinations of concurrency handling and resource reservation can lead to minutes of unstable system behavior due to cold starts. Our results highlight the need for a tailored approach in stateful serverless systems for edge computing scenarios.

Place, publisher, year, edition, pages
IEEE, 2024
Series
Annual Conference on Wireless On Demand Network Systems and Services, ISSN 2688-4917
Keywords
Distributed computing, Edge Computing, Fog computing
National Category
Computer Systems
Identifiers
urn:nbn:se:kth:diva-345136 (URN); 001181198900007 ()
Conference
19th Wireless On-Demand Network Systems and Services Conference (WONS), Jan 29-31, 2024, Chamonix, France
Note

QC 20240408

Part of ISBN 978-3-903176-61-4

Available from: 2024-04-08. Created: 2024-04-08. Last updated: 2024-04-08. Bibliographically approved
Horchidan, S.-F., Chen, P. H., Kritharakis, E., Carbone, P. & Kalavri, V. (2024). Crayfish: Navigating the Labyrinth of Machine Learning Inference in Stream Processing Systems. In: Advances in Database Technology - EDBT. Paper presented at 27th International Conference on Extending Database Technology, EDBT 2024, Paestum, Italy, Mar 25 2024 - Mar 28 2024 (pp. 676-689). Open Proceedings.org, 27, Article ID 3.
2024 (English). In: Advances in Database Technology - EDBT, Open Proceedings.org, 2024, Vol. 27, p. 676-689, article id 3. Conference paper, Published paper (Refereed)
Abstract [en]

As Machine Learning predictions are increasingly being used in business analytics pipelines, integrating stream processing with model serving has become a common data engineering task. Despite their synergies, separate software stacks typically handle streaming analytics and model serving. Systems for data stream management do not support ML inference out-of-the-box, while model-serving frameworks have limited functionality for continuous data transformations, windowing, and other streaming tasks. As a result, developers are left with a design space dilemma whose trade-offs are not well understood. This paper presents Crayfish, an extensible benchmarking framework that facilitates designing and executing comprehensive evaluation studies of streaming inference pipelines. We demonstrate the capabilities of Crayfish by studying four data processing systems, three embedded libraries, three external serving frameworks, and two pre-trained models. Our results prove the necessity of a standardized benchmarking framework and show that (1) even for serving tools in the same category, the performance can vary greatly and, sometimes, defy intuition, (2) GPU accelerators can show compelling improvements for the serving task, but the improvement varies across tools, and (3) serving alternatives can achieve significantly different performance, depending on the stream processors they are integrated with.

Place, publisher, year, edition, pages
Open Proceedings.org, 2024
Series
Advances in Database Technology - EDBT, ISSN 2367-2005 ; 27
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-346149 (URN); 10.48786/edbt.2024.58 (DOI); 2-s2.0-85190993856 (Scopus ID)
Conference
27th International Conference on Extending Database Technology, EDBT 2024, Paestum, Italy, Mar 25 2024 - Mar 28 2024
Note

QC 20240507

Part of ISBN 978-389318091-2, 978-389318094-3, 978-389318095-0

Available from: 2024-05-03. Created: 2024-05-03. Last updated: 2024-05-07. Bibliographically approved
Meldrum, M. & Carbone, P. (2024). μWheel: Aggregate Management for Streams and Queries. In: DEBS 2024 - Proceedings of the 18th ACM International Conference on Distributed and Event-Based Systems. Paper presented at 18th ACM International Conference on Distributed and Event-Based Systems, DEBS 2024, Villeurbanne, France, Jun 25 2024 - Jun 28 2024 (pp. 54-65). Association for Computing Machinery (ACM)
2024 (English). In: DEBS 2024 - Proceedings of the 18th ACM International Conference on Distributed and Event-Based Systems, Association for Computing Machinery (ACM), 2024, p. 54-65. Conference paper, Published paper (Refereed)
Abstract [en]

Aggregate management is equally significant for both streaming and query workloads. However, the prevalent approach of separating stream processing and query analysis impairs performance, hinders aggregate reuse, increases resource demands, and lowers data freshness. μWheel addresses this problem by unifying aggregate management needs within a single system optimized for continuous event streams. μWheel pre-aggregates and indexes timestamped data arriving out-of-order, enabling the sharing of aggregates across arbitrary time intervals while respecting low watermarks. Our performance analysis demonstrates that μWheel dramatically outperforms current aggregate sharing techniques for high-volume streaming, particularly when handling numerous concurrent window slides. Crucially, μWheel also delivers performance comparable to specialized pre-aggregation indexes for supporting ad-hoc queries and does so with significantly reduced storage requirements. μWheel's efficiency stems from its compact wheel-based data layout, featuring implicit timestamps, a query-agnostic time hierarchy, and a query optimizer designed to minimize aggregate operations.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2024
Keywords
aggregate management, embedded analytics, stream processing
National Category
Computer Sciences Communication Systems
Identifiers
urn:nbn:se:kth:diva-351923 (URN); 10.1145/3629104.3666031 (DOI); 2-s2.0-85200696289 (Scopus ID)
Conference
18th ACM International Conference on Distributed and Event-Based Systems, DEBS 2024, Villeurbanne, France, Jun 25 2024 - Jun 28 2024
Note

Part of ISBN 9798400704437

QC 20240910

Available from: 2024-08-19. Created: 2024-08-19. Last updated: 2024-09-10. Bibliographically approved
Ng, H., Haridi, S. & Carbone, P. (2023). Omni-Paxos: Breaking the Barriers of Partial Connectivity. In: Proceedings of the Eighteenth European Conference on Computer Systems, EuroSys 2023. Paper presented at 18th European Conference on Computer Systems (EuroSys), May 08-12, 2023, Sapienza Univ Rome, Rome, Italy (pp. 314-330). Association for Computing Machinery (ACM)
2023 (English). In: Proceedings of the Eighteenth European Conference on Computer Systems, EuroSys 2023, Association for Computing Machinery (ACM), 2023, p. 314-330. Conference paper, Published paper (Refereed)
Abstract [en]

Omni-Paxos is a system for state machine replication that is completely resilient to partial network partitions, a major source of service disruptions in recent years. Omni-Paxos achieves its resilience through a decoupled design that separates the execution and state of leader election from log replication. The leader election builds on the concept of quorum-connected servers, with the sole focus on connectivity. Additionally, by decoupling reconfiguration from log replication, Omni-Paxos provides flexible and parallel log migration that improves the performance and robustness of reconfiguration. Our evaluation showcases two benefits over state-of-the-art protocols: (1) guaranteed recovery in at most four election timeouts under extreme partial network partitions, and (2) up to 8x shorter reconfiguration periods with 46% less I/O at the leader.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2023
Keywords
consensus, state machine replication, partial connectivity, reconfiguration
National Category
Computer Engineering
Identifiers
urn:nbn:se:kth:diva-338157 (URN); 10.1145/3552326.3587441 (DOI); 001062106700020 (); 2-s2.0-85160212180 (Scopus ID)
Conference
18th European Conference on Computer Systems (EuroSys), May 08-12, 2023, Sapienza Univ Rome, Rome, Italy
Note

Part of proceedings ISBN 978-1-4503-9487-1

QC 20231016

Available from: 2023-10-16. Created: 2023-10-16. Last updated: 2023-10-16. Bibliographically approved
Horchidan, S. & Carbone, P. (2023). ORB: Empowering Graph Queries through Inference. In: ESWC-JP 2023: Joint Proceedings of the ESWC 2023 Workshops and Tutorials, co-located with 20th European Semantic Web Conference, ESWC 2023. Paper presented at Joint of the 20th European Semantic Web Conference - Workshops and Tutorials, ESWC-JP 2023, Hersonissos, Greece, May 28 2023 - May 29 2023. CEUR-WS
2023 (English). In: ESWC-JP 2023: Joint Proceedings of the ESWC 2023 Workshops and Tutorials, co-located with 20th European Semantic Web Conference, ESWC 2023, CEUR-WS, 2023. Conference paper, Published paper (Refereed)
Abstract [en]

Executing queries on incomplete, sparse knowledge graphs yields incomplete results, especially when it comes to queries involving traversals. In this paper, we question the applicability of all known architectures for incomplete knowledge bases and propose ORB: a clear departure from existing system designs, relying on Machine Learning-based operators to provide inferred query results. At the same time, ORB addresses peculiarities inherent to knowledge graphs, such as schema evolution, dynamism, scalability, as well as high query complexity via the use of embedding-driven inference. Through ORB, we stress that approximating complex processing tasks is not only desirable but also imperative for knowledge graphs.

Place, publisher, year, edition, pages
CEUR-WS, 2023
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-336774 (URN); 2-s2.0-85168679093 (Scopus ID)
Conference
Joint of the 20th European Semantic Web Conference - Workshops and Tutorials, ESWC-JP 2023, Hersonissos, Greece, May 28 2023 - May 29 2023
Note

QC 20230920

Available from: 2023-09-20. Created: 2023-09-20. Last updated: 2023-09-20. Bibliographically approved
Spenger, J., Huang, C., Haller, P. & Carbone, P. (2023). Portals: A Showcase of Multi-Dataflow Stateful Serverless. In: Proceedings 49th International Conference on Very Large Data Bases, VLDB 2023. Paper presented at 49th International Conference on Very Large Data Bases, VLDB 2023, Vancouver, Canada, Aug 28 2023 - Sep 1 2023 (pp. 4054-4057). Association for Computing Machinery (ACM)
2023 (English). In: Proceedings 49th International Conference on Very Large Data Bases, VLDB 2023, Association for Computing Machinery (ACM), 2023, p. 4054-4057. Conference paper, Published paper (Refereed)
Abstract [en]

Serverless applications spanning the cloud and edge require flexible programming frameworks for expressing compositions across the different levels of deployment. Another critical aspect for applications with state is failure resilience beyond the scope of a single dataflow graph that is the current standard in data streaming systems. This paper presents Portals, an interactive, stateful dataflow composition framework with strong end-to-end guarantees. Portals enables event-driven, resilient applications that span across dataflow graphs and serverless deployments. The demonstration exhibits three scenarios in our multi-dataflow streaming-based system: dynamically composing a stateful serverless application; an interactive cloud and edge serverless application; and a Portals browser playground. This work was partially funded by Digital Futures, the Swedish Foundation for Strategic Research under Grant No.: BD15-0006, as well as RISE AI.

Place, publisher, year, edition, pages
Association for Computing Machinery (ACM), 2023
National Category
Computer Sciences Software Engineering
Identifiers
urn:nbn:se:kth:diva-339286 (URN); 10.14778/3611540.3611619 (DOI); 001067701000080 (); 2-s2.0-85174497385 (Scopus ID)
Conference
49th International Conference on Very Large Data Bases, VLDB 2023, Vancouver, Canada, Aug 28 2023 - Sep 1 2023
Note

QC 20231106

Available from: 2023-11-06. Created: 2023-11-06. Last updated: 2024-01-23. Bibliographically approved
Ng, H., Wu, K. & Carbone, P. (2023). UniCache: Efficient Log Replication through Learning Workload Patterns. Paper presented at 26th International Conference on Extending Database Technology, EDBT 2023, Ioannina, Greece, Mar 28 2023 - Mar 31 2023 (pp. 471-477). OpenProceedings.org
2023 (English). Conference paper, Published paper (Refereed)
Abstract [en]

Most of the world's cloud data service workloads are currently backed by replicated state machines. Production-grade log replication protocols used for the job impose heavy data transfer duties on the primary server, which needs to disseminate the log commands to all the replica servers. UniCache proposes a principal solution to this problem using a learned replicated cache which enables commands to be sent over the network as compressed encodings. UniCache takes advantage of the fact that each replica has access to a consistent prefix of the replicated log, which allows replicas to build a uniform lookup cache used for compressing and decompressing commands consistently. UniCache achieves effective speedups, lowering the primary load in application workloads with a skewed data distribution. Our experimental studies showcase a low pre-processing overhead and the highest performance gains in cross-data center deployments over wide area networks.
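
The core idea in this abstract, that replicas holding the same log prefix can derive the same lookup cache and therefore exchange short references instead of full commands, can be illustrated with a minimal sketch. This is not the UniCache implementation: the class name `PrefixCache`, the frequency-based ranking (a simple stand-in for the paper's learned cache), and the `("ref", i)` / `("raw", cmd)` wire format are all assumptions made for illustration.

```python
from collections import Counter


class PrefixCache:
    """Hypothetical lookup cache derived deterministically from a log prefix."""

    def __init__(self, log_prefix, size):
        counts = Counter(log_prefix)
        # Rank by descending frequency, breaking ties on the command itself,
        # so every replica with the same prefix builds the same table.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        self.values = [cmd for cmd, _ in ranked[:size]]
        self.index = {cmd: i for i, cmd in enumerate(self.values)}

    def encode(self, command):
        # Cached commands travel as a small index; others are sent verbatim.
        i = self.index.get(command)
        return ("ref", i) if i is not None else ("raw", command)

    def decode(self, encoded):
        kind, payload = encoded
        return self.values[payload] if kind == "ref" else payload


# Both sides see the same (assumed) consistent prefix of the replicated log.
prefix = ["SET a 1", "SET b 2", "SET a 1", "GET a", "SET a 1", "GET a"]
leader = PrefixCache(prefix, size=2)
follower = PrefixCache(prefix, size=2)

wire = [leader.encode(c) for c in ["SET a 1", "DEL z"]]
decoded = [follower.decode(e) for e in wire]
```

Because the table is a pure function of the shared prefix, no extra coordination is needed to keep encoder and decoder in agreement; the benefit grows with workload skew, since frequent commands are the ones replaced by short references.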

Place, publisher, year, edition, pages
OpenProceedings.org, 2023
National Category
Computer Sciences
Identifiers
urn:nbn:se:kth:diva-334437 (URN); 10.48786/edbt.2023.39 (DOI); 2-s2.0-85165119241 (Scopus ID)
Conference
26th International Conference on Extending Database Technology, EDBT 2023, Ioannina, Greece, Mar 28 2023 - Mar 31 2023
Note

QC 20231123

Available from: 2023-08-21. Created: 2023-08-21. Last updated: 2023-11-23. Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0002-9351-8508