Tag Similarity in Folksonomies

Sergieh, Hatem Mousselly ; Egyed-Zsigmond, Elöd ; Gianini, Gabriele ; Döller, Mario ; Kosch, Harald ; Pinon, Jean-Marie (2013)
Tag Similarity in Folksonomies.
Paris, France
Conference publication, Bibliography

Abstract

Folksonomies, collections of user-contributed tags, have proved efficient in reducing the inherent semantic gap. However, user tags are noisy and therefore need to be processed before further applications can use them. In this paper, we propose an approach for bootstrapping semantics from folksonomy tags. Our goal is to automatically identify semantically related tags. The approach creates a probability distribution for each tag based on co-occurrence statistics. The similarity between two tags is then determined by the distance between their corresponding probability distributions. For this purpose, we propose an extension of the well-known Jensen-Shannon divergence. We compared our approach to a widely used method for identifying similar tags based on the cosine measure. The evaluation shows promising results and emphasizes the advantage of our approach.
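The pipeline outlined in the abstract (co-occurrence distributions per tag, then a divergence between distributions, compared against a cosine baseline) can be illustrated with a minimal sketch. This record does not describe the authors' extension of the Jensen-Shannon divergence, so the code below uses the standard, unextended divergence as a stand-in. The function names, the way co-occurrence is counted, and the toy data are all assumptions made for illustration, not the paper's implementation.

```python
import math
from collections import Counter, defaultdict
from itertools import permutations


def cooccurrence_distributions(tagged_items):
    """Map each tag to a probability distribution over the tags it co-occurs with.

    Assumption: two tags co-occur when they are attached to the same item.
    """
    counts = defaultdict(Counter)
    for tags in tagged_items:
        for a, b in permutations(set(tags), 2):
            counts[a][b] += 1
    dists = {}
    for tag, ctr in counts.items():
        total = sum(ctr.values())
        dists[tag] = {other: n / total for other, n in ctr.items()}
    return dists


def jensen_shannon_divergence(p, q):
    """Standard (unextended) JSD with base-2 logarithms, so values lie in [0, 1]."""
    keys = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2 over the union of both supports.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl_to_mixture(x):
        # KL(X || M); terms with zero probability contribute nothing.
        return sum(px * math.log2(px / m[k]) for k, px in x.items() if px > 0)

    return 0.5 * kl_to_mixture(p) + 0.5 * kl_to_mixture(q)


def cosine_similarity(p, q):
    """Cosine baseline over the same sparse co-occurrence vectors."""
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


# Hypothetical toy folksonomy: each inner list holds the tags of one item.
items = [
    ["paris", "eiffel", "france"],
    ["paris", "louvre", "france"],
    ["cat", "pet"],
    ["dog", "pet"],
]
dists = cooccurrence_distributions(items)
for a, b in [("cat", "dog"), ("cat", "paris")]:
    jsd = jensen_shannon_divergence(dists[a], dists[b])
    cos = cosine_similarity(dists[a], dists[b])
    print(f"{a} vs {b}: JSD={jsd:.3f}, cosine={cos:.3f}")
```

A lower divergence means more similar tags, whereas a higher cosine value means more similar tags, so the two measures rank in opposite directions.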

Item type: Conference publication
Published: 2013
Author(s): Sergieh, Hatem Mousselly ; Egyed-Zsigmond, Elöd ; Gianini, Gabriele ; Döller, Mario ; Kosch, Harald ; Pinon, Jean-Marie
Entry type: Bibliography
Title: Tag Similarity in Folksonomies
Language: English
Publication date: May 2013
Book title: Actes du XXXIème Congrès INFORSID, Paris, France, 29-31 Mai 2013.
Event location: Paris, France
URL / URN: https://liris.cnrs.fr/Documents/Liris-6007.pdf

ID number: TUD-CS-2013-0453
Department(s)/Division(s): 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Date deposited: 31 Dec 2016 14:29
Last modified: 20 Sep 2018 15:50