Web Corpus Mining by Instance of Wikipedia

Gleim, Rüdiger and Mehler, Alexander and Dehmer, Matthias (2006):
Web Corpus Mining by Instance of Wikipedia.
In: Proceedings of the EACL 2006 Workshop on Web as Corpus, Trento, Italy, [Conference or Workshop Item]

Abstract

In this paper we present an approach to structure learning for web documents, with the aim of web genre tagging in web corpus linguistics. A central outcome of the paper is that purely structure-oriented approaches to web document classification provide an information gain that may be utilized in approaches combining web content and structure analysis.

Item Type: Conference or Workshop Item
Published: 2006
Creators: Gleim, Rüdiger and Mehler, Alexander and Dehmer, Matthias
Title: Web Corpus Mining by Instance of Wikipedia
Language: German
Title of Book: Proceedings of the EACL 2006 Workshop on Web as Corpus, Trento, Italy
Divisions: 20 Department of Computer Science > Telecooperation
20 Department of Computer Science
Date Deposited: 31 Dec 2016 12:59
Identification Number: GMD:2006