Exploiting Debate Portals for Semi-supervised Argumentation Mining in User-Generated Web Discourse

Habernal, Ivan ; Gurevych, Iryna (2015)
Exploiting Debate Portals for Semi-supervised Argumentation Mining in User-Generated Web Discourse.
Lisbon, Portugal
Conference publication, bibliography

Abstract

Analyzing arguments in user-generated Web discourse has recently gained attention in argumentation mining, an evolving field of NLP. Current approaches, which employ fully-supervised machine learning, are usually domain-dependent and suffer from the lack of large and diverse annotated corpora. However, annotating arguments in discourse is costly, error-prone, and highly context-dependent. We ask whether leveraging unlabeled data in a semi-supervised manner can boost the performance of argument component identification and to what extent the approach is independent of domain and register. We propose novel features that exploit clustering of unlabeled data from debate portals based on a word embeddings representation. Using these features, we significantly outperform several baselines in the cross-validation, cross-domain, and cross-register evaluation scenarios.

Item type: Conference publication
Published: 2015
Author(s): Habernal, Ivan ; Gurevych, Iryna
Entry type: Bibliography
Title: Exploiting Debate Portals for Semi-supervised Argumentation Mining in User-Generated Web Discourse
Language: English
Publication date: September 2015
Publisher: Association for Computational Linguistics
Book title: Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP)
Event location: Lisbon, Portugal
URL / URN: http://www.aclweb.org/anthology/D15-1255

Uncontrolled keywords: UKP_a_ArMin;UKP_reviewed;argumentation mining
ID number: TUD-CS-2015-1178
Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
DFG Research Training Groups
DFG Research Training Groups > Research Training Group 1994 Adaptive Preparation of Information from Heterogeneous Sources
Date deposited: 31 Dec 2016 14:29
Last modified: 24 Jan 2020 12:03