
Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors

Daheim, Nico ; Macina, Jakub ; Kapur, Manu ; Gurevych, Iryna ; Sachan, Mrinmaya (2024)
Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors.
29th Conference on Empirical Methods in Natural Language Processing. Miami, USA (November 12-16, 2024)
doi: 10.18653/v1/2024.emnlp-main.478
Conference publication, Bibliography

Abstract

Large language models (LLMs) offer many opportunities to scale high-quality personalized tutoring. A promising approach is to build dialog tutoring models that scaffold students’ problem-solving. However, even though existing models perform well in solving reasoning questions, they can struggle to precisely detect students’ errors and to tailor their feedback to these errors. Inspired by real-world teaching practice, where teachers identify student errors and customize their response accordingly, we focus on verifying student solutions and show how grounding in such verification improves the overall quality of tutor response generation. We collect a dataset of 1,002 stepwise math reasoning chains with the first error step annotated by teachers. We show empirically that finding the mistake in a student solution is challenging for current models. We propose and evaluate several verifiers for detecting these errors. Using both automatic and human evaluation, we show that student solution verifiers steer the generation model towards highly targeted responses to student errors, which are more often correct and contain fewer hallucinations than existing baselines. The benchmark dataset and code will be released openly.
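
As an illustration of the pipeline the abstract describes, here is a minimal, rule-based sketch in which a verifier locates the first incorrect step of a student's solution and the tutor reply is grounded on that step. This is not the authors' released code: the step format, the arithmetic-only checker, and the feedback template are assumptions made for the example; the paper instead proposes learned verifiers evaluated against teacher annotations.

    # Toy stepwise verifier: find the first incorrect step in a student's
    # arithmetic chain, then ground the tutor's reply on that step.
    # (Hypothetical stand-in for the learned verifiers studied in the paper.)
    import re
    from typing import Optional

    STEP_RE = re.compile(r"^\s*(-?\d+)\s*([+\-*/])\s*(-?\d+)\s*=\s*(-?\d+)\s*$")

    def first_error_step(steps: list[str]) -> Optional[int]:
        """Return the 0-based index of the first incorrect step, or None."""
        for i, step in enumerate(steps):
            m = STEP_RE.match(step)
            if m is None:
                return i  # a step we cannot parse counts as an error here
            a, op, b, claimed = int(m[1]), m[2], int(m[3]), int(m[4])
            if op == "+":
                actual = a + b
            elif op == "-":
                actual = a - b
            elif op == "*":
                actual = a * b
            else:
                actual = a // b  # "/" as exact integer division (toy assumption)
            if actual != claimed:
                return i
        return None

    def grounded_feedback(steps: list[str]) -> str:
        """Condition the tutor response on the verifier's output."""
        i = first_error_step(steps)
        if i is None:
            return "Nice work -- every step checks out!"
        return (f"Step {i + 1} ('{steps[i]}') doesn't look quite right. "
                "What do you get if you redo that operation?")

    solution = ["12 + 8 = 20", "20 * 3 = 50", "50 - 10 = 40"]
    print(grounded_feedback(solution))  # flags step 2: 20 * 3 is 60, not 50

The control flow mirrors the approach in the abstract (verify first, then condition the response on the detected error); in the paper the verifier is a learned model scored against the teacher-annotated first-error steps rather than a hand-written checker.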

Item type: Conference publication
Published: 2024
Author(s): Daheim, Nico ; Macina, Jakub ; Kapur, Manu ; Gurevych, Iryna ; Sachan, Mrinmaya
Type of entry: Bibliography
Title: Stepwise Verification and Remediation of Student Reasoning Errors with Large Language Model Tutors
Language: English
Year of publication: November 2024
Publisher: ACL
Book title: EMNLP 2024: The 2024 Conference on Empirical Methods in Natural Language Processing: Proceedings of the Conference
Event title: 29th Conference on Empirical Methods in Natural Language Processing
Event location: Miami, USA
Event dates: November 12-16, 2024
DOI: 10.18653/v1/2024.emnlp-main.478
URL / URN: https://aclanthology.org/2024.emnlp-main.478/

Free keywords: UKP_p_seditrah_factcheck, UKP_p_crisp_senpai
Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Date deposited: 09 Dec 2024 13:01
Last modified: 09 Dec 2024 13:01