
AmbiFC: Fact-Checking Ambiguous Claims with Evidence

Glockner, Max ; Staliūnaitė, Ieva ; Thorne, James ; Vallejo, Gisela ; Vlachos, Andreas ; Gurevych, Iryna (2024)
AmbiFC: Fact-Checking Ambiguous Claims with Evidence.
In: Transactions of the Association for Computational Linguistics, 12
doi: 10.1162/tacl_a_00629
Article, Bibliography

Abstract

Automated fact-checking systems verify claims against evidence to predict their veracity. In real-world scenarios, the retrieved evidence may not unambiguously support or refute the claim and may yield conflicting but valid interpretations. Existing fact-checking datasets assume that the models developed with them predict a single veracity label for each claim, thus discouraging the handling of such ambiguity. To address this issue, we present AmbiFC, a fact-checking dataset with 10k claims derived from real-world information needs. It contains fine-grained evidence annotations of 50k passages from 5k Wikipedia pages. We analyze the disagreements arising from ambiguity when comparing claims against evidence in AmbiFC, observing a strong correlation of annotator disagreement with linguistic phenomena such as underspecification and probabilistic reasoning. We develop models for predicting veracity that handle this ambiguity via soft labels, and find that a pipeline that learns the label distribution for sentence-level evidence selection and veracity prediction yields the best performance. We compare models trained on different subsets of AmbiFC and show that models trained on the ambiguous instances perform better when faced with the identified linguistic phenomena.
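The abstract's notion of "soft labels" can be illustrated with a minimal sketch: annotator votes are normalized into a probability distribution over veracity labels, and a model's predicted distribution is scored against that target rather than against a single gold label. The label set and the cross-entropy scoring below are illustrative assumptions, not AmbiFC's exact modeling setup.

```python
# Minimal sketch (assumptions): convert per-annotator veracity judgments
# into a soft label distribution and score a predicted distribution
# against it with cross-entropy. The label names are hypothetical.
from collections import Counter
import math

LABELS = ["supported", "refuted", "neutral"]  # assumed label set

def soft_label(annotations):
    """Normalize annotator votes into a probability distribution over LABELS."""
    counts = Counter(annotations)
    total = len(annotations)
    return [counts[label] / total for label in LABELS]

def cross_entropy(target, predicted, eps=1e-12):
    """Cross-entropy of a predicted distribution w.r.t. the soft target."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(target, predicted))

# Example: 5 annotators disagree on an ambiguous claim.
votes = ["supported", "supported", "neutral", "supported", "neutral"]
target = soft_label(votes)  # → [0.6, 0.0, 0.4]
```

Training against such distributions lets a model express the annotator disagreement directly instead of collapsing it to a majority label.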

Item type: Article
Published: 2024
Author(s): Glockner, Max ; Staliūnaitė, Ieva ; Thorne, James ; Vallejo, Gisela ; Vlachos, Andreas ; Gurevych, Iryna
Type of entry: Bibliography
Title: AmbiFC: Fact-Checking Ambiguous Claims with Evidence
Language: English
Year of publication: 2024
Publisher: MIT Press
Journal or series title: Transactions of the Association for Computational Linguistics
Volume: 12
DOI: 10.1162/tacl_a_00629
URL / URN: https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00629...

Uncontrolled keywords: UKP_p_seditrah_factcheck
Department(s)/division(s): 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Date deposited: 18 Jan 2024 14:21
Last modified: 17 Apr 2024 10:13
PPN: 517201445