DeepDB: Learn from Data, Not from Queries!

Hilprecht, Benjamin ; Schmidt, Andreas ; Kulessa, Moritz ; Molina, Alejandro ; Kersting, Kristian ; Binnig, Carsten (2020)
DeepDB: Learn from Data, Not from Queries!
In: Proceedings of the VLDB Endowment, 13 (7)
doi: 10.14778/3384345.3384349
Article, Bibliography

Abstract

The typical approach for learned DBMS components is to capture the behavior by running a representative set of queries and use the observations to train a machine learning model. This workload-driven approach, however, has two major downsides. First, collecting the training data can be very expensive, since all queries need to be executed on potentially large databases. Second, training data has to be recollected when the workload or the database changes. To overcome these limitations, we take a different route and propose a new data-driven approach for learned DBMS components which directly supports changes of the workload and data without the need of retraining. Indeed, one may now expect that this comes at a price of lower accuracy since workload-driven approaches can make use of more information. However, this is not the case. The results of our empirical evaluation demonstrate that our data-driven approach not only provides better accuracy than state-of-the-art learned components but also generalizes better to unseen queries.

Item type: Article
Published: 2020
Author(s): Hilprecht, Benjamin ; Schmidt, Andreas ; Kulessa, Moritz ; Molina, Alejandro ; Kersting, Kristian ; Binnig, Carsten
Entry type: Bibliography
Title: DeepDB: Learn from Data, Not from Queries!
Language: English
Publication date: 26 March 2020
Publisher: ACM
Journal or series title: Proceedings of the VLDB Endowment
Volume: 13
Issue number: 7
DOI: 10.14778/3384345.3384349

Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Data Management (renamed Data and AI Systems in 2022)
DFG Collaborative Research Centres (incl. Transregio)
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > SFB 1053: MAKI – Multi-Mechanismen-Adaption für das künftige Internet
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > SFB 1053: MAKI – Multi-Mechanismen-Adaption für das künftige Internet > C: Communication Mechanisms
DFG Collaborative Research Centres (incl. Transregio) > Collaborative Research Centres > SFB 1053: MAKI – Multi-Mechanismen-Adaption für das künftige Internet > C: Communication Mechanisms > Subproject C2: Information-Centric View
Date deposited: 21 Apr 2022 08:49
Last modified: 21 Apr 2022 08:49