ZeroTune: Learned Zero-Shot Cost Models for Parallelism Tuning in Stream Processing

Agnihotri, Pratyush ; Koldehofe, Boris ; Stiegele, Paul ; Heinrich, Roman ; Binnig, Carsten ; Luthra, Manisha (2024)
ZeroTune: Learned Zero-Shot Cost Models for Parallelism Tuning in Stream Processing.
40th IEEE International Conference on Data Engineering (ICDE 2024). Utrecht, Netherlands (13.05.2024 - 17.05.2024)
doi: 10.1109/ICDE60146.2024.00163
Conference publication, Bibliography

Abstract

This paper introduces ZeroTune, a novel cost model for parallel and distributed stream processing that can be used to effectively set the initial parallelism degrees of streaming queries. Existing models rely mainly on online learning statistics that are non-transferable and context-specific and that require extensive training. To overcome these limitations, ZeroTune uses data-efficient zero-shot learning techniques that enable very accurate cost predictions without having observed any query deployment. ZeroTune is a graph neural network architecture that learns from the structural complexity of parallel distributed stream processing systems, which enables adaptation to unseen workloads and hardware configurations. In our experiments, we show that when ZeroTune is integrated into a distributed streaming system such as Apache Flink, we can accurately set the degree of parallelism, achieving an average speed-up of around 5× in comparison to existing approaches.

Item type: Conference publication
Published: 2024
Author(s): Agnihotri, Pratyush ; Koldehofe, Boris ; Stiegele, Paul ; Heinrich, Roman ; Binnig, Carsten ; Luthra, Manisha
Type of entry: Bibliography
Title: ZeroTune: Learned Zero-Shot Cost Models for Parallelism Tuning in Stream Processing
Language: English
Publication date: 23 July 2024
Publisher: IEEE
Book title: Proceedings: 2024 IEEE 40th International Conference on Data Engineering
Event title: 40th IEEE International Conference on Data Engineering (ICDE 2024)
Event location: Utrecht, Netherlands
Event dates: 13.05.2024 - 17.05.2024
DOI: 10.1109/ICDE60146.2024.00163
Uncontrolled keywords: Training, Costs, Zero-shot learning, Parallel processing, Predictive models, Data engineering, Data models, Zero-shot cost models, Parallelism tuning
Divisions: 18 Department of Electrical Engineering and Information Technology
18 Department of Electrical Engineering and Information Technology > Institute of Computer Engineering
18 Department of Electrical Engineering and Information Technology > Institute of Computer Engineering > Multimedia Communications
20 Department of Computer Science
20 Department of Computer Science > Data and AI Systems
DFG Collaborative Research Centres (incl. Transregio)
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > SFB 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > SFB 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > C: Communication Mechanisms
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > SFB 1053: MAKI – Multi-Mechanisms Adaptation for the Future Internet > C: Communication Mechanisms > Subproject C2: Information-Centric Perspective
Date deposited: 06 Sep 2024 09:13
Last modified: 21 Oct 2024 11:20
PPN: 522359175