
Zero-shot Cost Models for Distributed Stream Processing

Heinrich, Roman ; Luthra, Manisha ; Kornmayer, Harald ; Binnig, Carsten (2022)
Zero-shot Cost Models for Distributed Stream Processing.
16th ACM International Conference on Distributed and Event-Based Systems. Copenhagen, Denmark (27.-30.06.2022)
doi: 10.1145/3524860.3539639
Conference or Workshop Item, Bibliography

Abstract

This paper proposes a learned cost estimation model for Distributed Stream Processing Systems (DSPS) that aims to provide accurate cost predictions for query execution. A major premise of this work is that the proposed learned model can generalize to the dynamics of streaming workloads out of the box. This means that a model, once trained, can accurately predict performance metrics such as latency and throughput even if the characteristics of the data and workload, or the deployment of operators to hardware, change at runtime. The model can therefore be used for tasks such as optimizing the placement of operators to minimize the end-to-end latency of a streaming query or to maximize its throughput, even under varying conditions. Our evaluation on a well-known DSPS, Apache Storm, shows that the model predicts accurately for unseen workloads and queries while generalizing across real-world benchmarks.

Item Type: Conference or Workshop Item
Published: 2022
Creators: Heinrich, Roman ; Luthra, Manisha ; Kornmayer, Harald ; Binnig, Carsten
Type of entry: Bibliography
Title: Zero-shot Cost Models for Distributed Stream Processing
Language: English
Date: 15 July 2022
Publisher: ACM
Book Title: DEBS 2022: Proceedings of the 16th ACM International Conference on Distributed and Event-Based Systems
Event Title: 16th ACM International Conference on Distributed and Event-Based Systems
Event Location: Copenhagen, Denmark
Event Dates: 27.-30.06.2022
DOI: 10.1145/3524860.3539639
URL / URN: https://dl.acm.org/doi/10.1145/3524860.3539639
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Data Management (renamed Data and AI Systems in 2022)
DFG-Collaborative Research Centres (incl. Transregio)
DFG-Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres
DFG-Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > CRC 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet
DFG-Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > CRC 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > C: Communication Mechanisms
DFG-Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > CRC 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > C: Communication Mechanisms > Subproject C2: Information-centred perspective
DFG-Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > CRC 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > D: Technology
DFG-Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > CRC 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > D: Technology > Subproject D2: Data-Center-Technology
Date Deposited: 16 Aug 2022 08:07
Last Modified: 17 Nov 2022 12:10
PPN: 501733701