Performance Comparison of Ad-Hoc Retrieval Models over Full-Text vs. Titles of Documents

Saleh, Ahmed ; Beck, Tilman ; Galke, Lukas ; Scherp, Ansgar
Eds.: Dobreva, Milena ; Hinze, Annika ; Zumer, Maja (2018)
Performance Comparison of Ad-Hoc Retrieval Models over Full-Text vs. Titles of Documents.
In: Maturity and Innovation in Digital Libraries
doi: 10.1007/978-3-030-04257-8
Book chapter, Bibliography

Abstract

While there are many studies on information retrieval models using full-text, there are presently no comparison studies of full-text retrieval vs. retrieval only over the titles of documents. On the one hand, the full-text of documents like scientific papers is not always available due to, e.g., copyright policies of academic publishers. On the other hand, conducting a search based on titles alone has strong limitations. Titles are short and therefore may not contain enough information to yield satisfactory search results. In this paper, we compare different retrieval models regarding their search performance on the full-text vs. only titles of documents. We use different datasets, including the three digital library datasets: EconBiz, IREON, and PubMed. The results show that it is possible to build effective title-based retrieval models that provide competitive results comparable to full-text retrieval. The average evaluation results of the best title-based retrieval models are only 3% lower than those of the best full-text-based retrieval models.

Item type: Book chapter
Published: 2018
Editors: Dobreva, Milena ; Hinze, Annika ; Zumer, Maja
Author(s): Saleh, Ahmed ; Beck, Tilman ; Galke, Lukas ; Scherp, Ansgar
Type of entry: Bibliography
Title: Performance Comparison of Ad-Hoc Retrieval Models over Full-Text vs. Titles of Documents
Language: English
Year of publication: 2018
Place of publication: Cham
Publisher: Springer International Publishing
Book title: Maturity and Innovation in Digital Libraries
Event title: Maturity and Innovation in Digital Libraries
Event location: Cham
DOI: 10.1007/978-3-030-04257-8
URL / URN: https://www.springer.com/de/book/9783030042561

Department(s)/Division(s): 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Date deposited: 22 Nov 2018 07:28
Last modified: 28 Jan 2019 08:33