Towards Logical Hypertext Structure. A Graph-Theoretic Perspective

Mehler, Alexander and Dehmer, Matthias and Gleim, Rüdiger (2004):
Towards Logical Hypertext Structure. A Graph-Theoretic Perspective.
In: Proceedings of I2CS 04 - Innovative Internet Community Systems, Springer, pp. 136-150, [Conference or Workshop Item]

Abstract

Facing the retrieval problem posed by the overwhelming number of documents online, the adaptation of text categorization to web units has recently been pushed. The aim is to use categories of web sites and pages as an additional retrieval criterion. In this context, the bag-of-words model has been used along with HTML tags and link structures. Despite promising results, this adaptation remains within the framework of IR-specific models, since it neglects the content-based structuring inherent to hypertext units. This paper approaches hypertext modelling from the perspective of graph theory. It presents an XML-based format for representing websites as hypergraphs. These hypergraphs are used to shed light on the relation between hypertext structure types and their web-based instances. We emphasize two characteristics of this relation. In terms of realizational ambiguity, we speak of functional equivalents in the manifestation of the same structure type. In terms of polymorphism, we speak of a single web unit that manifests different structure types. A categorization experiment, which analyses a corpus of hypergraphs representing the structure and content of pages of conference websites, shows that polymorphism is a prevalent characteristic of web-based units. Against this background, we argue for a revision of text representation models by means of hypergraphs that are sensitive to the manifold structuring of web documents.

Item Type: Conference or Workshop Item
Published: 2004
Creators: Mehler, Alexander and Dehmer, Matthias and Gleim, Rüdiger
Title: Towards Logical Hypertext Structure. A Graph-Theoretic Perspective
Language: German
Title of Book: Proceedings of I2CS 04 - Innovative Internet Community Systems
Publisher: Springer
Divisions: 20 Department of Computer Science
20 Department of Computer Science > Telecooperation
Date Deposited: 31 Dec 2016 12:59
Identification Number: dehmertowards04