Zero-shot Cost Models for Distributed Stream Processing

Heinrich, Roman ; Luthra, Manisha ; Kornmayer, Harald ; Binnig, Carsten (2022)
Zero-shot Cost Models for Distributed Stream Processing.
16th ACM International Conference on Distributed and Event-Based Systems. Copenhagen, Denmark (27-30 June 2022)
doi: 10.1145/3524860.3539639
Conference publication, bibliography

Abstract

This paper proposes a learned cost estimation model for Distributed Stream Processing Systems (DSPS) that aims to provide accurate cost predictions for executing queries. A major premise of this work is that the proposed learned model can generalize to the dynamics of streaming workloads out of the box. This means that a model, once trained, can accurately predict performance metrics such as latency and throughput even if the characteristics of the data and workload or the deployment of operators to hardware change at runtime. The model can thus be used to solve tasks such as optimizing the placement of operators to minimize the end-to-end latency of a streaming query or maximize its throughput, even under varying conditions. Our evaluation on a well-known DSPS, Apache Storm, shows that the model predicts accurately for unseen workloads and queries while generalizing across real-world benchmarks.

Item type: Conference publication
Published: 2022
Author(s): Heinrich, Roman ; Luthra, Manisha ; Kornmayer, Harald ; Binnig, Carsten
Entry type: Bibliography
Title: Zero-shot Cost Models for Distributed Stream Processing
Language: English
Date of publication: 15 July 2022
Publisher: ACM
Book title: DEBS 2022: Proceedings of the 16th ACM International Conference on Distributed and Event-Based Systems
Event title: 16th ACM International Conference on Distributed and Event-Based Systems
Event location: Copenhagen, Denmark
Event date: 27-30 June 2022
DOI: 10.1145/3524860.3539639
URL / URN: https://dl.acm.org/doi/10.1145/3524860.3539639
Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Data Management (renamed Data and AI Systems in 2022)
DFG Collaborative Research Centres (incl. Transregio)
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > SFB 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > SFB 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > C: Communication Mechanisms
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > SFB 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > C: Communication Mechanisms > Subproject C2: Information-Centric Perspective
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > SFB 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > D: Technology
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > SFB 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > D: Technology > Subproject D2: Data Center Technology
Date deposited: 16 Aug 2022 08:07
Last modified: 17 Nov 2022 12:10
PPN: 501733701