In-tool Learning for Selective Manual Annotation in Large Corpora

Do Dinh, Erik-Lân ; Eckart de Castilho, Richard ; Gurevych, Iryna (2015)
In-tool Learning for Selective Manual Annotation in Large Corpora.
Beijing, China
Conference publication, Bibliography

Abstract

We present a novel approach to the selective annotation of large corpora through the use of machine learning. Linguistic search engines used to locate potential instances of an infrequent phenomenon do not support ranking of the search results. This favors the use of high-precision queries that return only a few results over broader queries that have a higher recall. Our approach introduces a classifier that ranks the search results and thus helps the annotator focus on those results with the highest potential of being an instance of the phenomenon in question, even in low-precision queries. The classifier is trained in an in-tool fashion: except for preprocessing, it relies only on the manual annotations made by the users in the querying tool itself. To implement this approach, we build upon an existing web-based multi-user search and annotation tool.
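The ranking idea in the abstract can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which classifier or features the authors use, so a small multinomial naive Bayes model over bag-of-words features stands in here, and the names `NaiveBayesRanker`, `train`, `score`, and `rank` are hypothetical. The sketch trains only on (text, label) pairs of the kind an annotator would produce in the tool, then sorts unannotated search results by their estimated likelihood of being an instance.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenization (a stand-in for real preprocessing)."""
    return text.lower().split()


class NaiveBayesRanker:
    """Hypothetical ranker: multinomial naive Bayes with add-one smoothing."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}

    def train(self, annotated):
        """annotated: iterable of (text, is_instance) pairs from manual annotation."""
        for text, is_instance in annotated:
            self.doc_counts[is_instance] += 1
            self.word_counts[is_instance].update(tokenize(text))

    def score(self, text):
        """Return the log-odds that `text` is an instance of the phenomenon."""
        vocab = set(self.word_counts[True]) | set(self.word_counts[False])
        pos_total = sum(self.word_counts[True].values()) + len(vocab)
        neg_total = sum(self.word_counts[False].values()) + len(vocab)
        # Smoothed class prior.
        log_odds = math.log((self.doc_counts[True] + 1) / (self.doc_counts[False] + 1))
        for word in tokenize(text):
            if word not in vocab:
                continue  # words never seen in any annotation carry no evidence
            p_pos = (self.word_counts[True][word] + 1) / pos_total
            p_neg = (self.word_counts[False][word] + 1) / neg_total
            log_odds += math.log(p_pos / p_neg)
        return log_odds

    def rank(self, results):
        """Sort search results so the most likely instances come first."""
        return sorted(results, key=self.score, reverse=True)
```

In use, an annotator's accepted and rejected results would be passed to `train`, and the hits of a broad, low-precision query would be passed to `rank`, so that likely instances appear at the top of the list.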

Item type: Conference publication
Published: 2015
Author(s): Do Dinh, Erik-Lân ; Eckart de Castilho, Richard ; Gurevych, Iryna
Entry type: Bibliography
Title: In-tool Learning for Selective Manual Annotation in Large Corpora
Language: English
Publication date: July 2015
Publisher: Association for Computational Linguistics and The Asian Federation of Natural Language Processing
Book title: Proceedings of ACL-IJCNLP 2015 System Demonstrations
Event location: Beijing, China
URL / URN: http://www.aclweb.org/anthology/P15-4003
Uncontrolled keywords: Knowledge Discovery in Scientific Literature; UKP_a_LangTech4eHum; UKP_s_CSniper; UKP_reviewed
ID number: TUD-CS-2015-0098
Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
DFG Research Training Groups
DFG Research Training Groups > Research Training Group 1994 Adaptive Preparation of Information from Heterogeneous Sources
Date deposited: 31 Dec 2016 14:29
Last modified: 24 Jan 2020 12:03