Steitz, Jan-Martin O. ; Roth, Stefan (2024)
Adapters Strike Back.
2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Seattle, USA (16.06.2024-22.06.2024)
doi: 10.1109/CVPR52733.2024.02213
Conference publication, bibliography
Abstract
Adapters provide an efficient and lightweight mechanism for adapting trained transformer models to a variety of different tasks. However, they have often been found to be outperformed by other adaptation mechanisms including low-rank adaptation. In this paper, we provide an in-depth study of adapters, their internal structure, as well as various implementation choices. We uncover pitfalls for using adapters and suggest a concrete, improved adapter architecture, called Adapter+, that not only outperforms previous adapter implementations but surpasses a number of other, more complex adaptation mechanisms in several challenging settings. Despite this, our suggested adapter is highly robust and, unlike previous work, requires little to no manual intervention when addressing a novel scenario. Adapter+ reaches state-of-the-art average accuracy on the VTAB benchmark, even without a per-task hyperparameter optimization. Code is available at https://github.com/visinf/adapter_plus.
Item type: | Conference publication |
---|---|
Published: | 2024 |
Author(s): | Steitz, Jan-Martin O. ; Roth, Stefan |
Type of entry: | Bibliography |
Title: | Adapters Strike Back |
Language: | English |
Publication date: | 16 September 2024 |
Publisher: | IEEE |
Book title: | Proceedings: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition: CVPR 2024 |
Event title: | 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition |
Event location: | Seattle, USA |
Event dates: | 16.06.2024-22.06.2024 |
DOI: | 10.1109/CVPR52733.2024.02213 |
Uncontrolled keywords: | emergenCITY_INF, emergenCITY |
Department(s)/field(s): | 20 Department of Computer Science; 20 Department of Computer Science > Visual Inference; LOEWE; LOEWE > LOEWE Centres; LOEWE > LOEWE Centres > emergenCITY |
Date deposited: | 15 Jan 2025 12:54 |
Last modified: | 15 Jan 2025 12:54 |