The Case for Multi-Task Zero-Shot Learning for Databases

Wehrstein, Johannes ; Hilprecht, Benjamin ; Olt, Benjamin ; Luthra, Manisha ; Binnig, Carsten (2022)
The Case for Multi-Task Zero-Shot Learning for Databases.
4th International Workshop on Applied AI for Database Systems and Applications. Sydney, Australia (05.09.2022)
Conference or Workshop Item, Bibliography

Abstract

Recently, machine learning has successfully been applied to many database problems such as query optimization, physical design tuning, or cardinality estimation. However, the predominant paradigm to design such learned database components is workload-driven learning, where a representative workload has to be executed on the database to gather training data. This costly procedure has to be repeated for every new database a model should be trained on. Hence, recently it was suggested to train zero-shot cost models that are pretrained once and can generalize to unseen databases out-of-the-box. While the results for the task of cost estimation are promising, it is unclear how to generalize this approach to additional tasks beyond query latency prediction. Hence, in this paper, we propose several directions to generalize zero-shot cost models to other tasks and validate our approaches in two case studies.

Item Type: Conference or Workshop Item
Published: 2022
Creators: Wehrstein, Johannes ; Hilprecht, Benjamin ; Olt, Benjamin ; Luthra, Manisha ; Binnig, Carsten
Type of entry: Bibliography
Title: The Case for Multi-Task Zero-Shot Learning for Databases
Language: English
Date: 5 September 2022
Journal or Publication Title: AIDB2022
Event Title: 4th International Workshop on Applied AI for Database Systems and Applications
Event Location: Sydney, Australia
Event Dates: 05.09.2022
Uncontrolled Keywords: systems_maki, systems_funding_52115350, databases, ml, learned database components, zero-shot learning, MV selection, ML4DB
Additional Information: Held with VLDB 2022

Divisions: 20 Department of Computer Science
20 Department of Computer Science > Data and AI Systems
TU-Projects: DFG|SFB1053|SFB1053 TPZ Steinmet
Date Deposited: 04 Apr 2023 12:48
Last Modified: 04 Apr 2023 12:48