TU Darmstadt / ULB / TUbiblio

Towards Logical Hypertext Structure. A Graph-Theoretic Perspective

Mehler, Alexander ; Dehmer, Matthias ; Gleim, Rüdiger (2004)
Towards Logical Hypertext Structure. A Graph-Theoretic Perspective.
Konferenzveröffentlichung, Bibliographie

Kurzbeschreibung (Abstract)

Facing the retrieval problem according to the overwhelming set of documents online the adaptation of text categorization to web units has recently been pushed. The aim is to utilize categories of web sites and pages as an additional retrieval criterion. In this context, the bag-of-words model has been utilized just as HTML tags and link structures. In spite of promising results this adaptation stays in the framework of IR specific models since it neglects the content-based structuring inherent to hypertext units. This paper approaches hypertext modelling from the perspective of graphtheory. It presents an XML-based format for representing websites as hypergraphs. These hypergraphs are used to shed light on the relation of hypertext structure types and their web-based instances. We place emphasis on two characteristics of this relation: In terms of realizational ambiguity we speak of functional equivalents to the manifestation of the same structure type. In terms of polymorphism we speak of a single web unit which manifests different structure types. It is shown that polymorphism is a prevalent characteristic of web-based units. This is done by means of a categorization experiment which analyses a corpus of hypergraphs representing the structure and content of pages of conference websites. On this background we plead for a revision of text representation models by means of hypergraphs which are sensitive to the manifold structuring of web documents.

Typ des Eintrags: Konferenzveröffentlichung
Erschienen: 2004
Autor(en): Mehler, Alexander ; Dehmer, Matthias ; Gleim, Rüdiger
Art des Eintrags: Bibliographie
Titel: Towards Logical Hypertext Structure. A Graph-Theoretic Perspective
Sprache: Deutsch
Publikationsjahr: 2004
Verlag: Springer
Buchtitel: Proceedings of I2CS 04 - Innovative Internet Community Systems
Kurzbeschreibung (Abstract):

Facing the retrieval problem according to the overwhelming set of documents online the adaptation of text categorization to web units has recently been pushed. The aim is to utilize categories of web sites and pages as an additional retrieval criterion. In this context, the bag-of-words model has been utilized just as HTML tags and link structures. In spite of promising results this adaptation stays in the framework of IR specific models since it neglects the content-based structuring inherent to hypertext units. This paper approaches hypertext modelling from the perspective of graphtheory. It presents an XML-based format for representing websites as hypergraphs. These hypergraphs are used to shed light on the relation of hypertext structure types and their web-based instances. We place emphasis on two characteristics of this relation: In terms of realizational ambiguity we speak of functional equivalents to the manifestation of the same structure type. In terms of polymorphism we speak of a single web unit which manifests different structure types. It is shown that polymorphism is a prevalent characteristic of web-based units. This is done by means of a categorization experiment which analyses a corpus of hypergraphs representing the structure and content of pages of conference websites. On this background we plead for a revision of text representation models by means of hypergraphs which are sensitive to the manifold structuring of web documents.

Fachbereich(e)/-gebiet(e): 20 Fachbereich Informatik
20 Fachbereich Informatik > Telekooperation
Hinterlegungsdatum: 31 Dez 2016 12:59
Letzte Änderung: 03 Jun 2018 21:29
PPN:
Export:
Suche nach Titel in: TUfind oder in Google
Frage zum Eintrag Frage zum Eintrag

Optionen (nur für Redakteure)
Redaktionelle Details anzeigen Redaktionelle Details anzeigen