Erbs, Nicolai ; Gurevych, Iryna ; Zesch, Torsten
Eds.: Angelova, Galia ; Bontcheva, Kalina ; Mitkov, Ruslan (2013)
Hierarchy Identification for Automatically Generating Table-of-Contents.
Hissar, Bulgaria
Conference publication, Bibliography
Abstract
A table-of-contents (TOC) provides a quick reference to a document’s content and structure. We present the first study on identifying the hierarchical structure for automatically generating a TOC using only textual features instead of structural hints, e.g., from HTML tags. We create two new datasets to evaluate our approaches for hierarchy identification. We find that our algorithm performs at a level that is sufficient for a fully automated system. For documents without given segment titles, we extend our work by automatically generating segment titles. We make the datasets and our experimental framework publicly available in order to foster future research in TOC generation.
Entry type: | Conference publication |
---|---|
Published: | 2013 |
Editors: | Angelova, Galia ; Bontcheva, Kalina ; Mitkov, Ruslan |
Author(s): | Erbs, Nicolai ; Gurevych, Iryna ; Zesch, Torsten |
Entry category: | Bibliography |
Title: | Hierarchy Identification for Automatically Generating Table-of-Contents |
Language: | English |
Publication date: | September 2013 |
Publisher: | INCOMA Ltd. |
Book title: | Proceedings of 9th Conference on Recent Advances in Natural Language Processing (RANLP 2013) |
Event location: | Hissar, Bulgaria |
URL / URN: | http://www.aclweb.org/anthology/R13-1033 |
Uncontrolled keywords: | Knowledge Discovery in Scientific Literature;UKP_a_NLP4Wikis;UKP_p_WIWEB;UKP_p_WIKULU;reviewed;UKP_s_JWPL;UKP_s_DKPro_Lab;UKP_s_DKPro_Core;UKP_p_openwindow |
ID number: | TUD-CS-2013-0198 |
Department(s)/Field(s): | 20 Department of Computer Science; 20 Department of Computer Science > Ubiquitous Knowledge Processing |
Date deposited: | 31 Dec 2016 14:29 |
Last modified: | 24 Jan 2020 12:03 |