Automatic Analysis of Arguments about Controversial Educational Topics in Web Documents

Kluge, Roland (2014):
Automatic Analysis of Arguments about Controversial Educational Topics in Web Documents.
Technische Universität Darmstadt, [Online-Edition: https://download.hrz.tu-darmstadt.de/media/FB20/Dekanat/Publ...],
[Master Thesis]

Abstract

Decision making in social communities, such as families, companies, or parties, builds on debates and discussions, where arguments on particular topics are exchanged. With this work, we contribute to the efforts in automatically processing arguments for decision making, which is embedded in the field of Argumentation Mining. Since only a few corpora for Argumentation Mining exist, we first built a corpus of argumentative German Web documents, containing 79 documents from 7 educational topics, which were annotated by 3 annotators according to the claim-premise argumentation model. The corpus comprises 70,000 tokens, annotated with 5,000 argument units, i.e., claims and premises. We found that the annotators performed similarly with regard to surface statistics such as the distribution of argument unit types or lengths. Each annotator’s annotations cover on average ca. 74% of the tokens, which indicates the argumentative nature of the dataset. The inter-annotator agreement evaluates to ca. 44% for Fleiss’ kappa and to ca. 40% for Krippendorff’s unitized alpha. We found that agreement correlates slightly negatively with annotation time demand per document. Finally, we present a number of experiments on the role of 360 discourse markers for discriminating claims from premises. Our results show that several intensifying discourse particles are distinctive for claims and premises. Furthermore, we confirmed expectations from the literature that the discourse relation concession introduces counter-arguments. The discourse relations comparison/contrast and result frequently indicate claims, while the discourse relations alternative, reason, and sequence tend to indicate premises. Another experiment investigated the role of discourse markers as features for Machine Learning. Using a Naïve Bayes classifier, we found that discourse markers as sole features for discriminating claims and premises yielded an improvement of 13 percentage points over the majority class baseline.
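For readers unfamiliar with the agreement measure named in the abstract, the following is a minimal sketch of how Fleiss' coefficient is commonly computed for a fixed number of annotators. It is not taken from the thesis: the function name `fleiss_kappa`, the label names, and the toy data are hypothetical, and the record does not state the unit (for example tokens or sentences) on which the thesis computed agreement.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Compute Fleiss' kappa.

    `ratings` is a list of items; each item is the list of labels
    assigned by the annotators (same number of annotators per item).
    Hypothetical helper for illustration, not the thesis's code.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    categories = sorted({label for item in ratings for label in item})
    counts = [Counter(item) for item in ratings]

    # Observed agreement: share of agreeing annotator pairs per item, averaged.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Expected agreement from the overall label proportions.
    proportions = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_expected = sum(p * p for p in proportions)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when only one label is ever used")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy data (assumed): 4 units, each labeled by 3 annotators.
toy = [
    ["claim", "claim", "claim"],
    ["premise", "premise", "claim"],
    ["none", "none", "none"],
    ["premise", "premise", "premise"],
]
kappa = fleiss_kappa(toy)
```

The coefficient corrects the raw share of agreeing annotator pairs for the agreement that the overall label distribution would produce by chance, which is why it can be much lower than the raw overlap between annotations.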

Item Type: Master Thesis
Published: 2014
Creators: Kluge, Roland
Title: Automatic Analysis of Arguments about Controversial Educational Topics in Web Documents
Language: English

Uncontrolled Keywords: Knowledge Discovery in Scientific Literature;UKP_a_ENLP;UKP_a_WALL
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Date Deposited: 31 Dec 2016 14:29
Official URL: https://download.hrz.tu-darmstadt.de/media/FB20/Dekanat/Publ...
Identification Number: TUD-CS-2014-0080
Referees: Eckle-Kohler, Judith and Gurevych, Iryna