Agnihotri, Pratyush ; Koldehofe, Boris ; Stiegele, Paul ; Heinrich, Roman ; Binnig, Carsten ; Luthra, Manisha (2024)
ZeroTune: Learned Zero-Shot Cost Models for Parallelism Tuning in Stream Processing.
40th IEEE International Conference on Data Engineering (ICDE 2024). Utrecht, Netherlands (13.05.2024 - 17.05.2024)
doi: 10.1109/ICDE60146.2024.00163
Conference publication, Bibliography
Abstract
This paper introduces ZeroTune, a novel cost model for parallel and distributed stream processing that can be used to effectively set initial parallelism degrees of streaming queries. Unlike existing models, which rely mainly on online learning statistics that are non-transferable, context-specific, and require extensive training, ZeroTune proposes data-efficient zero-shot learning techniques that enable very accurate cost predictions without having observed any query deployment. To overcome these challenges, we propose ZeroTune, a graph neural network architecture that can learn from the structural complexity of parallel distributed stream processing systems, enabling them to adapt to unseen workloads and hardware configurations. In our experiments, we show that when ZeroTune is integrated into a distributed streaming system such as Apache Flink, we can accurately set the degree of parallelism, showing an average speed-up of around 5× in comparison to existing approaches.
Item type: | Conference publication |
---|---|
Published: | 2024 |
Author(s): | Agnihotri, Pratyush ; Koldehofe, Boris ; Stiegele, Paul ; Heinrich, Roman ; Binnig, Carsten ; Luthra, Manisha |
Entry type: | Bibliography |
Title: | ZeroTune: Learned Zero-Shot Cost Models for Parallelism Tuning in Stream Processing |
Language: | English |
Date of publication: | 23 July 2024 |
Publisher: | IEEE |
Book title: | Proceedings: 2024 IEEE 40th International Conference on Data Engineering |
Event title: | 40th IEEE International Conference on Data Engineering (ICDE 2024) |
Event location: | Utrecht, Netherlands |
Event dates: | 13.05.2024 - 17.05.2024 |
DOI: | 10.1109/ICDE60146.2024.00163 |
Uncontrolled keywords: | Training, Costs, Zero-shot learning, Parallel processing, Predictive models, Data engineering, Data models, Zero-shot cost models, Parallelism tuning |
Division(s): | 18 Fachbereich Elektrotechnik und Informationstechnik; 18 Fachbereich Elektrotechnik und Informationstechnik > Institut für Datentechnik; 18 Fachbereich Elektrotechnik und Informationstechnik > Institut für Datentechnik > Multimedia Kommunikation; 20 Fachbereich Informatik; 20 Fachbereich Informatik > Data and AI Systems; DFG-Sonderforschungsbereiche (inkl. Transregio); DFG-Sonderforschungsbereiche (inkl. Transregio) > Sonderforschungsbereiche; DFG-Sonderforschungsbereiche (inkl. Transregio) > Sonderforschungsbereiche > SFB 1053: MAKI – Multi-Mechanismen-Adaption für das künftige Internet; DFG-Sonderforschungsbereiche (inkl. Transregio) > Sonderforschungsbereiche > SFB 1053: MAKI – Multi-Mechanismen-Adaption für das künftige Internet > C: Kommunikationsmechanismen; DFG-Sonderforschungsbereiche (inkl. Transregio) > Sonderforschungsbereiche > SFB 1053: MAKI – Multi-Mechanismen-Adaption für das künftige Internet > C: Kommunikationsmechanismen > Teilprojekt C2: Informationszentrische Sicht |
Date deposited: | 06 Sep 2024 09:13 |
Last modified: | 21 Oct 2024 11:20 |
PPN: | 522359175 |