Unsupervised Cue-words Discovery for Tag-sense Disambiguation: Comparing Dissimilarity Metrics

Legesse, Meshesha ; Gianini, Gabriele ; Teferi, Dereje ; Mousselly-Sergieh, Hatem ; Coquil, David ; Egyed-Zsigmond, Elöd (2015)
Unsupervised Cue-words Discovery for Tag-sense Disambiguation: Comparing Dissimilarity Metrics.
Caraguatatuba, Brazil
doi: 10.1145/2857218.2857222
Conference publication, bibliography

Abstract

Although tagging simplifies resource browsing and retrieval, it suffers from several issues, among them redundancy and ambiguity. In this work we focus on the problem of resolving tag word-sense ambiguity within a typical semi-automatic tagging procedure. In that process a user proposes a tag for a resource; if the tag is found to be related to more than one context, she is provided with two or more cues among which to choose, so as to remove the tag ambiguity. The key phases of such a disambiguation procedure are ambiguous tag detection and cue discovery. Both should rely on effective word-to-context relatedness metrics. Among the most effective relatedness metrics are those defined on the basis of a feature vector representation of the words. In this work we compare different word-to-context relatedness metrics in terms of their effectiveness within the disambiguation process. We propose to use a metric derived from a Maximum Likelihood estimator of the Jensen-Shannon Divergence among feature-count histograms, and we show that such a metric performs better, in terms of quality of the output, than both the Jensen-Shannon and the Symmetrized Kullback-Leibler divergence between histograms. We study the relative gain in quality within the task of unsupervised cue discovery by using a synthetic language corpus.
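
For readers unfamiliar with the two baseline measures named in the abstract, the following is a minimal Python sketch of the Jensen-Shannon divergence and the symmetrized Kullback-Leibler divergence between two feature-count histograms. It is an illustration under stated assumptions, not the authors' implementation: the function names, the optional additive smoothing, the use of base-2 logarithms, and the choice to define the symmetrized form as the sum of the two directed divergences are all assumptions made here. The metric proposed in the paper is not specified in the abstract and is not reproduced.

```python
import math


def normalize(counts, smoothing=0.0):
    """Turn a histogram of non-negative counts into a probability vector.

    `smoothing` is a hypothetical additive constant (an assumption of this
    sketch) that keeps every bin strictly positive when it is > 0.
    """
    total = sum(counts) + smoothing * len(counts)
    if total <= 0:
        raise ValueError("histogram must contain at least one count")
    return [(c + smoothing) / total for c in counts]


def kl_divergence(p, q):
    """Directed Kullback-Leibler divergence KL(p || q) in bits."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf  # p has mass where q has none
            total += pi * math.log2(pi / qi)
    return total


def symmetrized_kl(p, q):
    """Symmetrized KL, taken here as KL(p || q) + KL(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits, bounded between 0 and 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    # Two made-up feature-count histograms over the same four features.
    word = normalize([8, 1, 0, 1])
    context = normalize([2, 3, 4, 1])
    print("JSD:", jensen_shannon(word, context))
    print("Symmetrized KL (unsmoothed):", symmetrized_kl(word, context))
    print(
        "Symmetrized KL (smoothed):",
        symmetrized_kl(
            normalize([8, 1, 0, 1], smoothing=0.5),
            normalize([2, 3, 4, 1], smoothing=0.5),
        ),
    )
```

One practical difference is visible in the example: the unsmoothed symmetrized KL divergence is infinite whenever one histogram has an empty bin that the other fills, whereas the Jensen-Shannon divergence stays finite because each histogram is compared against the mixture of the two.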

Item type: Conference publication
Published: 2015
Author(s): Legesse, Meshesha ; Gianini, Gabriele ; Teferi, Dereje ; Mousselly-Sergieh, Hatem ; Coquil, David ; Egyed-Zsigmond, Elöd
Entry type: Bibliography
Title: Unsupervised Cue-words Discovery for Tag-sense Disambiguation: Comparing Dissimilarity Metrics
Language: English
Year of publication: 2015
Publisher: ACM
Book title: Proceedings of the 7th International Conference on Management of Computational and Collective intElligence in Digital EcoSystems
Event location: Caraguatatuba, Brazil
DOI: 10.1145/2857218.2857222
URL / URN: https://dl.acm.org/citation.cfm?id=2857222&dl=ACM&coll=DL
Uncontrolled keywords: Jensen-Shannon divergence, disambiguation, dissimilarity metrics, retrieval models and ranking, semantic relatedness, similarity measures, tagging
ID number: TUD-CS-2015-12061
Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Date deposited: 31 Dec 2016 14:29
Last modified: 18 Sep 2018 10:45