Hättasch, Benjamin ; Bodensohn, Jan-Micha ; Binnig, Carsten (2022)
Demonstrating ASET: Ad-Hoc Structured Exploration of Text Collections.
2022 International Conference on Management of Data. Philadelphia, USA (12-17 June 2022)
doi: 10.1145/3514221.3520174
Conference publication, bibliography
Abstract
In this demo, we present ASET, a novel tool to explore the contents of unstructured data (text) by automatically transforming relevant parts into tabular form. ASET works in an ad-hoc manner without the need to curate extraction pipelines for the (unseen) text collection or to annotate large amounts of training data. The main idea is to use a new two-phased approach that first extracts a superset of information nuggets from the texts using existing extractors such as named entity recognizers. In a second step, it leverages embeddings and a novel matching strategy to match the extractions to a structured table definition as requested by the user. This demo features the ASET system with a graphical user interface that allows people without machine learning or programming expertise to explore text collections efficiently. This can be done in a self-directed and flexible manner, and ASET provides an intuitive impression of the result quality.
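The two-phase idea in the abstract (over-extract candidate nuggets first, then match them to the columns the user asks for via embeddings) can be illustrated with a minimal sketch. This is not ASET's actual matching strategy or API: the function names, the pre-annotated nuggets, the example sentence, and the three-dimensional `TOY_EMBEDDINGS` table are all hypothetical stand-ins, and plain cosine similarity is used only as a simple placeholder for the matching step.

```python
import math

# Hypothetical toy embeddings (assumption); a real system would use a trained model.
TOY_EMBEDDINGS = {
    "person": (1.0, 0.0, 0.0),
    "author": (0.9, 0.1, 0.0),
    "location": (0.0, 1.0, 0.0),
    "city": (0.1, 0.9, 0.0),
    "date": (0.0, 0.0, 1.0),
    "year": (0.0, 0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def extract_nuggets(document):
    """Phase 1 stand-in: return (text, label) pairs.

    Assumption: the nuggets are pre-annotated here; in practice an existing
    extractor such as a named entity recognizer would produce them.
    """
    return document["nuggets"]


def match_to_columns(nuggets, columns):
    """Phase 2 stand-in: for each requested column, pick the most similar nugget."""
    row = {}
    for column in columns:
        best = max(
            nuggets,
            key=lambda n: cosine(TOY_EMBEDDINGS[n[1]], TOY_EMBEDDINGS[column]),
            default=None,
        )
        row[column] = best[0] if best else None
    return row


# Made-up example document, for illustration only.
document = {
    "text": "Alice Smith visited Berlin in 2019.",
    "nuggets": [
        ("Alice Smith", "person"),
        ("Berlin", "location"),
        ("2019", "date"),
    ],
}

row = match_to_columns(extract_nuggets(document), ["author", "city", "year"])
print(row)
```

The point of the split is that the extraction step never needs to know the target table: the column names are supplied only at matching time, which is what allows ad-hoc queries over an unseen collection.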
Item type: | Conference publication |
---|---|
Published: | 2022 |
Author(s): | Hättasch, Benjamin ; Bodensohn, Jan-Micha ; Binnig, Carsten |
Type of entry: | Bibliography |
Title: | Demonstrating ASET: Ad-Hoc Structured Exploration of Text Collections |
Language: | English |
Publication date: | July 2022 |
Publisher: | ACM |
Book title: | SIGMOD'22: Proceedings of the 2022 International Conference on Management of Data |
Event title: | 2022 International Conference on Management of Data |
Event location: | Philadelphia, USA |
Event dates: | 12-17 June 2022 |
DOI: | 10.1145/3514221.3520174 |
Uncontrolled keywords: | systems_aset, systems_wannadb, systems_intexplore, text to table, interactive text exploration, matching embeddings |
Department(s)/field(s): | 20 Department of Computer Science; 20 Department of Computer Science > Data and AI Systems |
Date deposited: | 06 Jun 2023 12:37 |
Last modified: | 02 Aug 2023 13:36 |
PPN: | 510088619 |