
All-in Text: Learning Document, Label, and Word Representations Jointly

Nam, Jinseok ; Loza Mencía, Eneldo ; Fürnkranz, Johannes (2016)
All-in Text: Learning Document, Label, and Word Representations Jointly.
Thirtieth AAAI Conference on Artificial Intelligence.
Conference publication, Bibliography

Abstract

Conventional multi-label classification algorithms treat the target labels of the classification task as mere symbols, devoid of inherent semantics. However, in many cases textual descriptions of these labels are available or can easily be constructed from public document sources such as Wikipedia. In this paper, we investigate an approach for embedding documents and labels into a joint space while sharing word representations between documents and labels. For finding such embeddings, we rely on the text of documents as well as descriptions for the labels. The use of such label descriptions not only leads us to expect increased performance on conventional multi-label text classification tasks, but can also be used to make predictions for labels that have not been seen during the training phase. The potential of our method is demonstrated on the multi-label classification task of assigning keywords from the Medical Subject Headings (MeSH) to publications in biomedical research, both in a conventional and in a zero-shot learning setting.
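The core idea of the abstract — representing both documents and label descriptions with shared word embeddings so that even unseen labels can be scored — can be sketched as follows. This is a minimal illustration, not the paper's learned model: the embedding matrix here is a random toy matrix, the tiny vocabulary and mean-pooling are assumptions, and the paper instead learns the representations jointly from data.

```python
import numpy as np

# Toy shared word-embedding matrix (the paper *learns* these jointly;
# here they are random, for illustration only).
rng = np.random.default_rng(0)
vocab = {"cancer": 0, "tumor": 1, "gene": 2, "protein": 3, "cell": 4}
E = rng.normal(size=(len(vocab), 8))

def embed(words):
    """Represent a text (document or label description) as the mean of
    its word vectors, drawn from the shared embedding matrix."""
    idx = [vocab[w] for w in words if w in vocab]
    return E[idx].mean(axis=0)

def score(doc_words, label_words):
    """Dot-product compatibility between a document and a label.
    Because a label is embedded via its textual description, a label
    never seen during training can be scored the same way (zero-shot)."""
    return float(embed(doc_words) @ embed(label_words))

doc = ["cancer", "tumor", "cell"]
seen_label = ["cancer", "gene"]      # description of a training-time label
unseen_label = ["protein", "cell"]   # description of a zero-shot label
print(score(doc, seen_label), score(doc, unseen_label))
```

Because documents and labels live in the same space, ranking the labels by this score yields a multi-label prediction, and nothing in the scoring step distinguishes seen from unseen labels.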

Entry type: Conference publication
Published: 2016
Author(s): Nam, Jinseok ; Loza Mencía, Eneldo ; Fürnkranz, Johannes
Record type: Bibliography
Title: All-in Text: Learning Document, Label, and Word Representations Jointly
Language: English
Year of publication: 2016
Book title: Proceedings of the AAAI Conference on Artificial Intelligence
Event title: Thirtieth AAAI Conference on Artificial Intelligence
URL / URN: https://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/view/12...

Uncontrolled keywords: Knowledge Discovery in Scientific Literature
ID number: TUD-CS-2016-0005
Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Knowledge Engineering
20 Department of Computer Science > Ubiquitous Knowledge Processing
DFG Research Training Groups
DFG Research Training Groups > Research Training Group 1994 Adaptive Preparation of Information from Heterogeneous Sources
Date deposited: 31 Dec 2016 00:25
Last modified: 13 Dec 2018 17:12