FUN with Fisher: Improving Generalization of Adapter-Based Cross-lingual Transfer with Scheduled Unfreezing

Liu, Chen ; Pfeiffer, Jonas ; Vulić, Ivan ; Gurevych, Iryna (2024)
FUN with Fisher: Improving Generalization of Adapter-Based Cross-lingual Transfer with Scheduled Unfreezing.
2024 Conference of the North American Chapter of the Association for Computational Linguistics. Mexico City, Mexico (17-21.06.2024)
Conference publication, Bibliography

Abstract

Standard fine-tuning of language models typically performs well on in-distribution data, but suffers with generalization to distribution shifts. In this work, we aim to improve the generalization of adapter-based cross-lingual task transfer where such cross-language distribution shifts are imminent. We investigate scheduled unfreezing algorithms – originally proposed to mitigate catastrophic forgetting in transfer learning – for fine-tuning task adapters. Our experiments show that scheduled unfreezing methods close the gap to full fine-tuning and achieve stronger cross-lingual transfer performance, suggesting that these methods can go beyond just mitigating catastrophic forgetting. Next, aiming to understand these empirical findings, we investigate the learning dynamics of scheduled unfreezing using Fisher Information. Our experiments reveal that scheduled unfreezing induces different learning dynamics compared to standard fine-tuning, and provide evidence that the dynamics of Fisher Information during training correlate with cross-lingual generalization performance. We additionally propose a general scheduled unfreezing algorithm that achieves an average of 2 points improvement over four datasets compared to standard fine-tuning and provides empirical evidence for a theory-based justification of the heuristic unfreezing schedule for task adapter training.
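
For readers unfamiliar with the two ideas named in the abstract, the following is a minimal illustrative sketch, not the authors' algorithm. It assumes one common variant of scheduled unfreezing (top-down: the last layer trains from the start, and one more layer is unfrozen every fixed number of steps) and uses the trace of the empirical Fisher Information (the mean squared norm of per-example gradients) as a simple scalar to track during training. The function names, the `interval` parameter, and the choice of the trace are assumptions made for illustration; the abstract does not specify the schedule or the exact Fisher quantity used in the paper.

```python
from typing import List, Sequence


def unfrozen_layers(step: int, num_layers: int, interval: int) -> List[int]:
    """Return the indices of trainable layers at a given training step.

    Hypothetical top-down schedule: layer `num_layers - 1` is trainable
    from step 0, and one additional layer (moving toward the input) is
    unfrozen every `interval` steps until all layers are trainable.
    """
    if num_layers <= 0 or interval <= 0 or step < 0:
        raise ValueError("num_layers and interval must be positive, step non-negative")
    count = min(num_layers, 1 + step // interval)
    return list(range(num_layers - count, num_layers))


def empirical_fisher_trace(per_example_grads: Sequence[Sequence[float]]) -> float:
    """Trace of the empirical Fisher: mean squared gradient norm over examples.

    Each inner sequence is the flattened gradient of the log-likelihood
    for one training example.
    """
    if not per_example_grads:
        raise ValueError("need at least one per-example gradient")
    total = sum(g * g for grad in per_example_grads for g in grad)
    return total / len(per_example_grads)


# Example: a 4-layer adapter stack with one layer unfrozen every 100 steps.
schedule_at_start = unfrozen_layers(step=0, num_layers=4, interval=100)
schedule_later = unfrozen_layers(step=250, num_layers=4, interval=100)
fisher = empirical_fisher_trace([[1.0, 2.0], [0.0, 3.0]])
```

In a real training loop, the returned indices would decide which adapter layers have gradients enabled at each step, and the Fisher trace would be logged alongside the loss to compare learning dynamics between schedules.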

Item type: Conference publication
Published: 2024
Author(s): Liu, Chen ; Pfeiffer, Jonas ; Vulić, Ivan ; Gurevych, Iryna
Type of entry: Bibliography
Title: FUN with Fisher: Improving Generalization of Adapter-Based Cross-lingual Transfer with Scheduled Unfreezing
Language: English
Date of publication: June 2024
Place of publication: Mexico City, Mexico
Publisher: Association for Computational Linguistics
Book title: Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Event title: 2024 Conference of the North American Chapter of the Association for Computational Linguistics
Event location: Mexico City, Mexico
Event date: 17-21.06.2024
URL / URN: https://aclanthology.org/2024.naacl-long.111
Uncontrolled keywords: UKP_p_MISRIK, UKP_p_emergencity
Division(s): 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Date deposited: 24 Jun 2024 11:27
Last modified: 25 Jun 2024 08:52
PPN: 519362527