
Dior-CVAE: Pre-trained Language Models and Diffusion Priors for Variational Dialog Generation

Yang, Tianyu ; Tran, Thy Thy ; Gurevych, Iryna (2023)
Dior-CVAE: Pre-trained Language Models and Diffusion Priors for Variational Dialog Generation.
2023 Conference on Empirical Methods in Natural Language Processing. Singapore (06.12.2023-10.12.2023)
doi: 10.18653/v1/2023.findings-emnlp.313
Conference publication, Bibliography

Abstract

Current variational dialog models have employed pre-trained language models (PLMs) to parameterize the likelihood and posterior distributions. However, the Gaussian assumption made on the prior distribution is incompatible with these distributions, thus restricting the diversity of generated responses. These models also suffer from posterior collapse, i.e., the decoder tends to ignore latent variables and directly access information captured in the encoder through the cross-attention mechanism. In this work, we propose Dior-CVAE, a hierarchical conditional variational autoencoder (CVAE) with diffusion priors to address these challenges. We employ a diffusion model to increase the complexity of the prior distribution and its compatibility with the distributions produced by a PLM. In addition, we apply memory dropout to the cross-attention mechanism, which actively encourages the use of latent variables for response generation. Overall, experiments across two commonly used open-domain dialog datasets show that our method can generate more diverse responses without large-scale dialog pre-training. Code is available at https://github.com/UKPLab/dior-cvae.
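
The abstract names two techniques that are easy to miss in prose: a diffusion model that replaces the Gaussian prior of the CVAE, and memory dropout on the cross-attention mechanism. The Python (PyTorch) sketches below are illustrations only, not the authors' implementation (see the linked repository); the function names, the noise schedule, and the denoiser interface eps_model are assumptions.

First, a minimal sketch of drawing the prior latent from a diffusion model conditioned on the dialog context, using standard DDPM ancestral sampling in latent space:

    import torch

    @torch.no_grad()
    def sample_diffusion_prior(eps_model, context, latent_dim, T=50):
        # DDPM ancestral sampling in latent space: start from Gaussian noise
        # and iteratively denoise, conditioning each step on the dialog context.
        # eps_model(z_t, t, context) -> predicted noise; a hypothetical interface.
        betas = torch.linspace(1e-4, 0.02, T)         # assumed linear schedule
        alphas = 1.0 - betas
        alpha_bars = torch.cumprod(alphas, dim=0)
        z = torch.randn(context.size(0), latent_dim)  # z_T ~ N(0, I)
        for t in reversed(range(T)):
            t_batch = torch.full((z.size(0),), t, dtype=torch.long)
            eps = eps_model(z, t_batch, context)      # predict the added noise
            coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
            mean = (z - coef * eps) / torch.sqrt(alphas[t])
            noise = torch.randn_like(z) if t > 0 else torch.zeros_like(z)
            z = mean + torch.sqrt(betas[t]) * noise   # z_{t-1}
        return z                                      # prior sample for the decoder

Second, a minimal sketch of memory dropout as the abstract describes it: during training, encoder states ("memory") are randomly dropped before cross-attention, so the decoder cannot recover all context information from the encoder and is pushed to rely on the latent variable instead. Dropping whole source positions rather than individual activations is one plausible reading:

    def memory_dropout(memory: torch.Tensor, p: float = 0.1,
                       training: bool = True) -> torch.Tensor:
        # memory: encoder hidden states of shape (batch, src_len, hidden).
        # With probability p, zero out an entire source position (all hidden
        # dims), so cross-attention sees an incomplete memory during training.
        if not training or p == 0.0:
            return memory
        keep = (torch.rand(memory.size(0), memory.size(1), 1,
                           device=memory.device) >= p).to(memory.dtype)
        return memory * keep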

Entry type: Conference publication
Published: 2023
Author(s): Yang, Tianyu ; Tran, Thy Thy ; Gurevych, Iryna
Record type: Bibliography
Title: Dior-CVAE: Pre-trained Language Models and Diffusion Priors for Variational Dialog Generation
Language: English
Year of publication: December 2023
Place of publication: Singapore
Publisher: Association for Computational Linguistics
Book title: Findings of the Association for Computational Linguistics: EMNLP 2023
Event title: 2023 Conference on Empirical Methods in Natural Language Processing
Event location: Singapore
Event dates: 06.12.2023-10.12.2023
DOI: 10.18653/v1/2023.findings-emnlp.313
URL / URN: https://aclanthology.org/2023.findings-emnlp.313/
Free keywords: UKP_p_SERMAS
Department(s)/field(s): 20 Department of Computer Science
20 Department of Computer Science > Ubiquitous Knowledge Processing
Date deposited: 18 Jan 2024 13:52
Last modified: 22 Mar 2024 07:49
PPN: 516499734