
Revisiting the Role of Uncertainty-Driven Exploration in a (Perceived) Non-Stationary World

Guo, Dalin ; Yu, Angela J. (2021)
Revisiting the Role of Uncertainty-Driven Exploration in a (Perceived) Non-Stationary World.
In: Proceedings of the Annual Meeting of the Cognitive Science Society, 43
Article, Bibliography

Abstract

Humans are often faced with an exploration-versus-exploitation trade-off. A commonly used paradigm, the multi-armed bandit task, has shown that humans exhibit an "uncertainty bonus", which combines with estimated reward to drive exploration. However, previous studies often modeled belief updating using either a Bayesian model that assumed the reward contingencies to remain stationary, or a reinforcement learning model. Separately, we previously showed that human learning in the bandit task is best captured by a dynamic-belief Bayesian model. We hypothesize that the estimated uncertainty bonus may depend on which learning model is employed. Here, we re-analyze a bandit dataset using all three learning models. We find that the dynamic-belief model captures human choice behavior best, while also uncovering a much larger uncertainty bonus than the other models. More broadly, our results emphasize the importance of an appropriate learning model for correctly characterizing the processes underlying human decision making.
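The abstract contrasts three learning rules feeding the same uncertainty-bonus choice rule. Below is a minimal illustrative sketch of that idea, not the authors' implementation: the Beta-Bernoulli form, the discount parameter `gamma`, and the `bonus` weight are assumptions chosen for illustration.

```python
def stationary_bayes_update(a, b, reward):
    """Conjugate Beta-Bernoulli update, assuming reward rates never change."""
    return a + reward, b + (1 - reward)

def rl_update(q, reward, lr=0.1):
    """Delta-rule reinforcement-learning update of an estimated value."""
    return q + lr * (reward - q)

def dynamic_belief_update(a, b, reward, gamma=0.8, prior=(1.0, 1.0)):
    """Dynamic-belief-style update: old evidence is discounted toward the
    prior on each trial, as if the reward rate could change at any time."""
    a = gamma * a + (1 - gamma) * prior[0]
    b = gamma * b + (1 - gamma) * prior[1]
    return a + reward, b + (1 - reward)

def choice_value(a, b, bonus=1.0):
    """Estimated reward plus an uncertainty bonus: here, the posterior
    standard deviation of a Beta(a, b) belief, scaled by `bonus`."""
    mean = a / (a + b)
    var = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean + bonus * var ** 0.5
```

Because the dynamic-belief update keeps discounting old observations, its posterior never fully sharpens, so uncertainty remains a live factor in choice values. The paper's broader point is that which learner one assumes changes the uncertainty bonus one estimates from behavior.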

Entry type: Article
Published: 2021
Author(s): Guo, Dalin ; Yu, Angela J.
Type of entry: Bibliography
Title: Revisiting the Role of Uncertainty-Driven Exploration in a (Perceived) Non-Stationary World
Language: English
Year of publication: 2021
Publisher: eScholarship Publishing
Journal or series title: Proceedings of the Annual Meeting of the Cognitive Science Society
Volume: 43
URL / URN: https://escholarship.org/uc/item/8xd759xp

Division(s): 03 Department of Human Sciences
03 Department of Human Sciences > Institute of Psychology
Date deposited: 27 Oct 2023 12:08
Last modified: 30 Oct 2023 06:35
PPN: 512752311