Urban, Matthias ; Nguyen, Duc Dat ; Binnig, Carsten
eds.: Bordawekar, Rajesh ; Shmueli, Oded ; Amsterdamer, Yael ; Firmani, Donatella ; Kipf, Andreas (2023)
OmniscientDB: A Large Language Model-Augmented DBMS That Knows What Other DBMSs Do Not Know.
6th International Workshop on Exploiting Artificial Intelligence Techniques for Data Management (aiDM'23). Seattle, USA (18.06.2023)
doi: 10.1145/3593078.3593933
Conference or Workshop Item, Bibliography
Abstract
In this paper, we present our vision of OmniscientDB, a novel database that leverages the implicitly-stored knowledge in large language models to augment datasets for analytical queries or even machine learning tasks. OmniscientDB empowers its users to augment their datasets by means of simple SQL queries and thus has the potential to dramatically reduce the manual overhead associated with data integration. It uses automatic prompt engineering to construct appropriate prompts for given SQL queries and passes them to a large language model like GPT-3 to contribute additional data (i.e., new rows, columns, or entire tables), augmenting the explicitly stored data. Our initial evaluation demonstrates the general feasibility of our vision, explores different prompting techniques in greater detail, and points towards several directions for future research.
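As a rough illustration of the idea in the abstract (building prompts from a query and letting a language model supply a column that is not stored), here is a minimal Python sketch. It is not the authors' implementation: the prompt template, the function names `build_prompt` and `augment_column`, and the `complete` callable standing in for a model such as GPT-3 are all assumptions made for this example.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def build_prompt(table: str, row: Row, key: str, missing_column: str) -> str:
    """Turn one stored row into a completion-style prompt.

    The template is a hypothetical example, not the one used in the paper.
    """
    return f"The {missing_column} of the {table} {row[key]} is"


def augment_column(
    table: str,
    rows: List[Row],
    key: str,
    missing_column: str,
    complete: Callable[[str], str],
) -> List[Row]:
    """Return copies of `rows` extended with a model-generated column.

    `complete` is any function mapping a prompt to generated text; in a real
    system it would wrap a call to a large language model (assumption).
    """
    augmented = []
    for row in rows:
        answer = complete(build_prompt(table, row, key, missing_column)).strip()
        # Copy the row so the explicitly stored data stays untouched.
        augmented.append({**row, missing_column: answer})
    return augmented


if __name__ == "__main__":
    # A canned lookup stands in for the language model in this demo.
    canned = {"The capital of the country Germany is": " Berlin"}
    stored = [{"name": "Germany"}]
    print(augment_column("country", stored, "name", "capital", lambda p: canned[p]))
```

Passing the model as a plain callable keeps the sketch testable without network access; a query such as `SELECT name, capital FROM country` would, under these assumptions, trigger one prompt per stored row for the unstored `capital` column.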
Item Type: | Conference or Workshop Item |
---|---|
Published: | 2023 |
Editors: | Bordawekar, Rajesh ; Shmueli, Oded ; Amsterdamer, Yael ; Firmani, Donatella ; Kipf, Andreas |
Creators: | Urban, Matthias ; Nguyen, Duc Dat ; Binnig, Carsten |
Type of entry: | Bibliography |
Title: | OmniscientDB: A Large Language Model-Augmented DBMS That Knows What Other DBMSs Do Not Know |
Language: | English |
Date: | 20 June 2023 |
Publisher: | ACM |
Book Title: | Proceedings of the Sixth International Workshop on Exploiting Artificial Intelligence Techniques for Data Management |
Event Title: | 6th International Workshop on Exploiting Artificial Intelligence Techniques for Data Management (aiDM'23) |
Event Location: | Seattle, USA |
Event Dates: | 18.06.2023 |
DOI: | 10.1145/3593078.3593933 |
Additional Information: | Art.No.: 4 |
Divisions: | 20 Department of Computer Science; 20 Department of Computer Science > Data and AI Systems |
Date Deposited: | 24 Jul 2023 13:03 |
Last Modified: | 25 Jul 2023 16:26 |
PPN: | 50991800X |