TU Darmstadt / ULB / TUbiblio

Automatic recognition of German news focusing on future-directed beliefs and intentions

Eckle-Kohler, Judith and Kohler, Michael and Mehnert, Jens (2008):
Automatic recognition of German news focusing on future-directed beliefs and intentions.
In: Computer Speech and Language, pp. 394-414, 22, (4), [Online-Edition: https://www.sciencedirect.com/science/article/pii/S088523080...],
[Article]

Abstract

We consider the classification of German news stories as either focusing on future-directed beliefs and intentions or lacking these. The method proposed in this article requires only a small set of labeled training data. Rather, we introduce German clues for the automatic identification of future-orientation which are used for automatic labeling of Reuters news stories. We describe the development of a high-precision procedure for automatic labeling in a bootstrapping fashion: A first version of the labeling procedure uses the absence of clues for future-directedness as indicator for non-future-directedness and is able to automatically label about one-third of the Reuters news stories with high precision. Then a perceptron is applied to the automatically labeled news stories in order to semi-automatically acquire an additional set of clues for non-future-directedness. The second version of the labeling procedure additionally uses these clues and achieves remarkably improved results in terms of recall; it can even be extended by a guessing step to perform classification with an error of 22.5%. We also investigate another way to increase the recall by using the automatically labeled news stories as training data for statistical classifiers. Three different types of statistical classifiers are applied in order to address the question, which classifier is most suited for the text classification task considered. The best statistical classifier combined with the results of improved automatic labeling is able to recognize the two classes of news stories with an error of 19%.

Item Type: Article
Erschienen: 2008
Creators: Eckle-Kohler, Judith and Kohler, Michael and Mehnert, Jens
Title: Automatic recognition of German news focusing on future-directed beliefs and intentions
Language: English
Abstract:

We consider the classification of German news stories as either focusing on future-directed beliefs and intentions or lacking these. The method proposed in this article requires only a small set of labeled training data. Rather, we introduce German clues for the automatic identification of future-orientation which are used for automatic labeling of Reuters news stories. We describe the development of a high-precision procedure for automatic labeling in a bootstrapping fashion: A first version of the labeling procedure uses the absence of clues for future-directedness as indicator for non-future-directedness and is able to automatically label about one-third of the Reuters news stories with high precision. Then a perceptron is applied to the automatically labeled news stories in order to semi-automatically acquire an additional set of clues for non-future-directedness. The second version of the labeling procedure additionally uses these clues and achieves remarkably improved results in terms of recall; it can even be extended by a guessing step to perform classification with an error of 22.5%. We also investigate another way to increase the recall by using the automatically labeled news stories as training data for statistical classifiers. Three different types of statistical classifiers are applied in order to address the question, which classifier is most suited for the text classification task considered. The best statistical classifier combined with the results of improved automatic labeling is able to recognize the two classes of news stories with an error of 19%.

Journal or Publication Title: Computer Speech and Language
Volume: 22
Number: 4
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Date Deposited: 31 Dec 2016 14:29
Official URL: https://www.sciencedirect.com/science/article/pii/S088523080...
Identification Number: TUD-CS-2008-11509
Related URLs:
Export:
Suche nach Titel in: TUfind oder in Google

Optionen (nur für Redakteure)

View Item View Item