Wang, Yuxia ; Reddy, Revanth Gangi ; Mujahid, Zain Muhammad ; Arora, Arnav ; Rubashevskii, Aleksandr ; Geng, Jiahui ; Afzal, Osama Mohammed ; Pan, Liangming ; Borenstein, Nadav ; Pillai, Aditya ; Augenstein, Isabelle ; Gurevych, Iryna ; Nakov, Preslav (2024)
Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers.
29th Conference on Empirical Methods in Natural Language Processing. Miami, USA (12.11.2024 - 16.11.2024)
doi: 10.18653/v1/2024.findings-emnlp.830
Conference publication, Bibliography
Abstract
The increased use of large language models (LLMs) across a variety of real-world applications calls for mechanisms to verify the factual accuracy of their outputs. In this work, we present Factcheck-Bench, a holistic end-to-end framework for annotating and evaluating the factuality of LLM-generated responses, which encompasses a multi-stage annotation scheme designed to yield detailed labels for fact-checking and correcting not just the final prediction, but also the intermediate steps that a fact-checking system might need to take. Based on this framework, we construct an open-domain factuality benchmark at three levels of granularity: claim, sentence, and document. We further propose a system, Factcheck-GPT, which follows our framework, and we show that it outperforms several popular LLM fact-checkers. We make our annotation tool, annotated data, benchmark, and code available at https://github.com/yuxiaw/Factcheck-GPT.
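The three granularity levels named in the abstract (claim, sentence, document) can be pictured with a minimal data-model sketch. The class names, fields, and label set below are illustrative assumptions, not the published Factcheck-Bench schema; the actual annotation format and data live in the linked repository (https://github.com/yuxiaw/Factcheck-GPT).

```python
# Illustrative sketch only: field and class names are assumptions, not the
# official Factcheck-Bench data format.
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class ClaimLabel(Enum):
    SUPPORTED = "supported"                    # claim is backed by evidence
    REFUTED = "refuted"                        # claim contradicts evidence
    NOT_ENOUGH_EVIDENCE = "not_enough_evidence"


@dataclass
class Claim:
    text: str
    label: ClaimLabel
    correction: Optional[str] = None           # revised claim if the original is refuted


@dataclass
class Sentence:
    text: str
    claims: list[Claim] = field(default_factory=list)

    @property
    def factual(self) -> bool:
        # A sentence counts as factual only if every extracted claim is supported.
        return all(c.label is ClaimLabel.SUPPORTED for c in self.claims)


@dataclass
class Document:
    prompt: str                                # the question posed to the LLM
    response_sentences: list[Sentence] = field(default_factory=list)

    @property
    def factual(self) -> bool:
        # Document-level factuality aggregates over sentence-level judgments.
        return all(s.factual for s in self.response_sentences)
```

Under this (assumed) structure, claim-level labels roll up into sentence-level judgments and then into a document-level verdict, mirroring the claim/sentence/document granularity the benchmark evaluates.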
Type of entry: | Conference publication |
---|---|
Published: | 2024 |
Author(s): | Wang, Yuxia ; Reddy, Revanth Gangi ; Mujahid, Zain Muhammad ; Arora, Arnav ; Rubashevskii, Aleksandr ; Geng, Jiahui ; Afzal, Osama Mohammed ; Pan, Liangming ; Borenstein, Nadav ; Pillai, Aditya ; Augenstein, Isabelle ; Gurevych, Iryna ; Nakov, Preslav |
Type of record: | Bibliography |
Title: | Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers |
Language: | English |
Year of publication: | November 2024 |
Publisher: | ACL |
Book title: | EMNLP 2024: The 2024 Conference on Empirical Methods in Natural Language Processing: Findings of EMNLP 2024 |
Event title: | 29th Conference on Empirical Methods in Natural Language Processing |
Event location: | Miami, USA |
Event dates: | 12.11.2024 - 16.11.2024 |
DOI: | 10.18653/v1/2024.findings-emnlp.830 |
URL / URN: | https://aclanthology.org/2024.findings-emnlp.830/ |
Department(s)/area(s): | 20 Department of Computer Science; 20 Department of Computer Science > Ubiquitous Knowledge Processing |
Date deposited: | 17 Dec 2024 11:43 |
Last modified: | 17 Dec 2024 11:43 |