Mehler, Alexander ; Dehmer, Matthias ; Gleim, Rüdiger (2004)
Towards Logical Hypertext Structure. A Graph-Theoretic Perspective.
Conference publication, Bibliography
Abstract
Facing the retrieval problem posed by the overwhelming number of documents online, the adaptation of text categorization to web units has recently been pushed. The aim is to utilize categories of web sites and pages as an additional retrieval criterion. In this context, the bag-of-words model has been utilized, as have HTML tags and link structures. In spite of promising results, this adaptation stays within the framework of IR-specific models since it neglects the content-based structuring inherent to hypertext units. This paper approaches hypertext modelling from the perspective of graph theory. It presents an XML-based format for representing websites as hypergraphs. These hypergraphs are used to shed light on the relation between hypertext structure types and their web-based instances. We place emphasis on two characteristics of this relation: in terms of realizational ambiguity, we speak of functional equivalents for the manifestation of the same structure type; in terms of polymorphism, we speak of a single web unit which manifests different structure types. It is shown that polymorphism is a prevalent characteristic of web-based units. This is done by means of a categorization experiment which analyses a corpus of hypergraphs representing the structure and content of pages of conference websites. Against this background, we plead for a revision of text representation models by means of hypergraphs which are sensitive to the manifold structuring of web documents.
Item type: | Conference publication |
---|---|
Published: | 2004 |
Author(s): | Mehler, Alexander ; Dehmer, Matthias ; Gleim, Rüdiger |
Type of entry: | Bibliography |
Title: | Towards Logical Hypertext Structure. A Graph-Theoretic Perspective |
Language: | German |
Year of publication: | 2004 |
Publisher: | Springer |
Book title: | Proceedings of I2CS 04 - Innovative Internet Community Systems |
Department(s)/field(s): | 20 Department of Computer Science; 20 Department of Computer Science > Telecooperation |
Date deposited: | 31 Dec 2016 12:59 |
Last modified: | 03 Jun 2018 21:29 |