
Mining Multiword Terms from Wikipedia

Hartmann, Silvana ; Szarvas, György ; Gurevych, Iryna
Eds.: Pazienza, Maria Teresa ; Stellato, Armando (2012)
Mining Multiword Terms from Wikipedia.
In: Semi-Automatic Ontology Development: Processes and Resources
Book chapter, Bibliography

Abstract

The collection of the specialized vocabulary of a particular domain (terminology) is an important initial step in creating formalized domain knowledge representations (ontologies). Terminology extraction (TE) aims at automating this process by collecting the relevant domain vocabulary from existing lexical resources or collections of domain texts. In this chapter, we address the extraction of multiword terminology, as multiword terms are very frequent in terminology but typically poorly represented in standard lexical resources. We present our method for mining multiword terminology from Wikipedia and the freely available terminology resource that we extracted using the presented method. Terminology extraction based on Wikipedia exploits the advantages of a huge multilingual, domain-transcending knowledge source and large-scale structural information that can identify potential multiword units without the need for linguistic processing tools. Thus, while evaluated in English, the proposed method is in principle applicable to all languages in Wikipedia.

Item type: Book chapter
Published: 2012
Editors: Pazienza, Maria Teresa ; Stellato, Armando
Author(s): Hartmann, Silvana ; Szarvas, György ; Gurevych, Iryna
Type of entry: Bibliography
Title: Mining Multiword Terms from Wikipedia
Language: English
Year of publication: 2012
Publisher: IGI Global
Book title: Semi-Automatic Ontology Development: Processes and Resources
URL / URN: https://www.igi-global.com/chapter/mining-multiword-terms-wi...

Uncontrolled keywords: UKP_a_NLP4Wikis;UKP_p_QAEL
ID number: TUD-CS-2011-0204
Department(s)/Division(s): 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Date deposited: 31 Dec 2016 14:29
Last modified: 14 Nov 2023 09:52