Wang, Yuxia ; Reddy, Revanth Gangi ; Mujahid, Zain Muhammad ; Arora, Arnav ; Rubashevskii, Aleksandr ; Geng, Jiahui ; Afzal, Osama Mohammed ; Pan, Liangming ; Borenstein, Nadav ; Pillai, Aditya ; Augenstein, Isabelle ; Gurevych, Iryna ; Nakov, Preslav (2024)
Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers.
29th Conference on Empirical Methods in Natural Language Processing. Miami, USA (12.11.2024 - 16.11.2024)
doi: 10.18653/v1/2024.findings-emnlp.830
Conference publication, Bibliography
Abstract
The increased use of large language models (LLMs) across a variety of real-world applications calls for mechanisms to verify the factual accuracy of their outputs. In this work, we present Factcheck-Bench, a holistic end-to-end framework for annotating and evaluating the factuality of LLM-generated responses, which encompasses a multi-stage annotation scheme designed to yield detailed labels for fact-checking and correcting not just the final prediction, but also the intermediate steps that a fact-checking system might need to take. Based on this framework, we construct an open-domain factuality benchmark at three levels of granularity: claim, sentence, and document. We further propose a system, Factcheck-GPT, which follows our framework, and we show that it outperforms several popular LLM fact-checkers. We make our annotation tool, annotated data, benchmark, and code available at https://github.com/yuxiaw/Factcheck-GPT.
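The three granularity levels named in the abstract (claim, sentence, document) can be pictured with a minimal data-model sketch. The class names, fields, and label set below are illustrative assumptions, not the published Factcheck-Bench schema; the actual annotation format and data live in the linked repository (https://github.com/yuxiaw/Factcheck-GPT).

```python
# Illustrative sketch only: field and class names are assumptions, not the
# official Factcheck-Bench data format.
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class ClaimLabel(Enum):
    SUPPORTED = "supported"                    # claim is backed by evidence
    REFUTED = "refuted"                        # claim contradicts evidence
    NOT_ENOUGH_EVIDENCE = "not_enough_evidence"


@dataclass
class Claim:
    text: str
    label: ClaimLabel
    correction: Optional[str] = None           # revised claim if the original is refuted


@dataclass
class Sentence:
    text: str
    claims: list[Claim] = field(default_factory=list)

    @property
    def factual(self) -> bool:
        # A sentence counts as factual only if every extracted claim is supported.
        return all(c.label is ClaimLabel.SUPPORTED for c in self.claims)


@dataclass
class Document:
    prompt: str                                # the question posed to the LLM
    response_sentences: list[Sentence] = field(default_factory=list)

    @property
    def factual(self) -> bool:
        # Document-level factuality aggregates over sentence-level judgments.
        return all(s.factual for s in self.response_sentences)
```

Under this (assumed) structure, claim-level labels roll up into sentence-level judgments and then into a document-level verdict, mirroring the claim/sentence/document granularity the benchmark evaluates.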
Type of entry: | Conference publication |
---|---|
Published: | 2024 |
Author(s): | Wang, Yuxia ; Reddy, Revanth Gangi ; Mujahid, Zain Muhammad ; Arora, Arnav ; Rubashevskii, Aleksandr ; Geng, Jiahui ; Afzal, Osama Mohammed ; Pan, Liangming ; Borenstein, Nadav ; Pillai, Aditya ; Augenstein, Isabelle ; Gurevych, Iryna ; Nakov, Preslav |
Type of record: | Bibliography |
Title: | Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers |
Language: | English |
Year of publication: | November 2024 |
Publisher: | ACL |
Book title: | EMNLP 2024: The 2024 Conference on Empirical Methods in Natural Language Processing: Findings of EMNLP 2024 |
Event title: | 29th Conference on Empirical Methods in Natural Language Processing |
Event location: | Miami, USA |
Event dates: | 12.11.2024 - 16.11.2024 |
DOI: | 10.18653/v1/2024.findings-emnlp.830 |
URL / URN: | https://aclanthology.org/2024.findings-emnlp.830/ |
Department(s)/area(s): | 20 Department of Computer Science; 20 Department of Computer Science > Ubiquitous Knowledge Processing |
Date deposited: | 17 Dec 2024 11:43 |
Last modified: | 17 Dec 2024 11:43 |