DBPal: Weak Supervision for Learning a Natural Language Interface to Databases

Weir, Nathaniel and Crotty, Andrew and Galakatos, Alex and Ilkhechi, Amir and Ramaswamy, Shekar and Bhushan, Rohin and Cetintemel, Ugur and Utama, Prasetya and Geisler, Nadja and Hättasch, Benjamin and Eger, Steffen and Binnig, Carsten (2019):
DBPal: Weak Supervision for Learning a Natural Language Interface to Databases.
In: 1st International Workshop on Conversational Access to Data (CAST), in conjunction with the 45th International Conference on Very Large Data Bases (VLDB), Los Angeles, California, USA. [Conference or Workshop Item]

Abstract

This paper describes DBPal, a new system that translates natural language utterances into SQL statements using a neural machine translation model. Other recent approaches also use neural machine translation to implement a Natural Language Interface to Databases (NLIDB), but they rely on supervised learning with manually curated training data, which incurs substantial overhead for each new database schema. To avoid this overhead, DBPal implements a novel training pipeline based on weak supervision that synthesizes all training data from a given database schema. Our evaluation shows that DBPal can outperform existing rule-based NLIDBs and achieves performance comparable to NLIDBs that leverage deep neural network models, without requiring manually curated training data for every new database schema.

Item Type: Conference or Workshop Item
Published: 2019
Creators: Weir, Nathaniel and Crotty, Andrew and Galakatos, Alex and Ilkhechi, Amir and Ramaswamy, Shekar and Bhushan, Rohin and Cetintemel, Ugur and Utama, Prasetya and Geisler, Nadja and Hättasch, Benjamin and Eger, Steffen and Binnig, Carsten
Title: DBPal: Weak Supervision for Learning a Natural Language Interface to Databases
Language: English
Place of Publication: Los Angeles, California, USA
Divisions: 20 Department of Computer Science; 20 Department of Computer Science > Data Management; DFG-Graduiertenkollegs; DFG-Graduiertenkollegs > Research Training Group 1994 Adaptive Preparation of Information from Heterogeneous Sources
Event Title: 1st International Workshop on Conversational Access to Data (CAST) in conj. with the 45th International Conference on Very Large Data Bases (VLDB)
Event Location: Los Angeles, California, USA
Date Deposited: 24 Jul 2019 13:17