Zero-Shot Cost Models for Parallel Stream Processing

Agnihotri, Pratyush ; Koldehofe, Boris ; Binnig, Carsten ; Luthra, Manisha (2023)
Zero-Shot Cost Models for Parallel Stream Processing.
6th International Workshop on Exploiting Artificial Intelligence Techniques for Data Management (aiDM 2023). Seattle, USA (18.06.2023-18.06.2023)
doi: 10.1145/3593078.3593934
Conference publication, Bibliography

Abstract

This paper addresses the challenge of predicting the degree of parallelism in distributed stream processing (DSP) systems, which are essential for handling the high and varied workload requirements of industries such as e-commerce and online gaming, where DSP systems are used extensively. Existing DSP systems rely either on manual tuning of the parallelism degree or on workload-driven learned models for tuning parallelism, approaches that are either inefficient or can lead to costly operator migrations and downtime under workload drift. Thus, we argue for a learned model that can autonomously decide on the right parallelism degree while generalizing across workloads and meeting the current demands of DSP applications. We propose a novel approach that leverages zero-shot cost models to predict the parallelism degree while generalizing to unseen streaming workloads out of the box. To reduce training effort, we propose a rule-based strategy that selects the parallelism degree and meaningful transferable features, related to query workload and hardware, that influence parallelism decisions. We demonstrate the effectiveness of our strategy by evaluating it with different numbers of training queries and show that it achieves lower costs for parallel continuous query processing.

Item type: Conference publication
Published: 2023
Author(s): Agnihotri, Pratyush ; Koldehofe, Boris ; Binnig, Carsten ; Luthra, Manisha
Entry type: Bibliography
Title: Zero-Shot Cost Models for Parallel Stream Processing
Language: English
Date of publication: 20 June 2023
Publisher: ACM
Book title: Proceedings of the Sixth International Workshop on Exploiting Artificial Intelligence Techniques for Data Management
Event title: 6th International Workshop on Exploiting Artificial Intelligence Techniques for Data Management (aiDM 2023)
Event location: Seattle, USA
Event date: 18.06.2023-18.06.2023
DOI: 10.1145/3593078.3593934
Uncontrolled keywords: parallelism prediction, distributed stream processing, zero-shot cost models
Department(s)/field(s): 18 Department of Electrical Engineering and Information Technology
18 Department of Electrical Engineering and Information Technology > Institute of Computer Engineering
18 Department of Electrical Engineering and Information Technology > Institute of Computer Engineering > Multimedia Communications
20 Department of Computer Science
20 Department of Computer Science > Data and AI Systems
DFG Collaborative Research Centres (incl. Transregio)
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > CRC 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > CRC 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > C: Communication Mechanisms
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > CRC 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > C: Communication Mechanisms > Subproject C2: Information-Centric Perspective
Date deposited: 28 Nov 2023 13:30
Last modified: 30 Jan 2024 10:22
PPN: 515130648