Guo, Dalin ; Yu, Angela J. (2021)
Revisiting the Role of Uncertainty-Driven Exploration in a (Perceived) Non-Stationary World.
In: Proceedings of the Annual Meeting of the Cognitive Science Society, 43
Article, Bibliography
Abstract
Humans often face an exploration-versus-exploitation trade-off. In a commonly used paradigm, the multi-armed bandit task, humans have been shown to exhibit an "uncertainty bonus", which combines with the estimated reward to drive exploration. However, previous studies typically modeled belief updating using either a Bayesian model that assumes the reward contingencies remain stationary, or a reinforcement learning model. Separately, we previously showed that human learning in the bandit task is best captured by a dynamic-belief Bayesian model. We hypothesize that the estimated uncertainty bonus may depend on which learning model is employed. Here, we re-analyze a bandit dataset using all three learning models. We find that the dynamic-belief model captures human choice behavior best, while also uncovering a much larger uncertainty bonus than the other models do. More broadly, our results underscore the importance of choosing an appropriate learning model, which is crucial for correctly characterizing the processes underlying human decision making.
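The abstract describes choices driven by estimated reward plus an uncertainty bonus, with beliefs updated either under a stationary assumption or a dynamic (non-stationary) one. As a minimal illustrative sketch — not the authors' exact model — the idea can be shown with a Beta-Bernoulli bandit, where the bonus is the posterior standard deviation and a hypothetical decay parameter `gamma` leaks old evidence toward the prior to mimic a dynamic-belief learner:

```python
import math

def choose_arm(alphas, betas, bonus_weight):
    """Pick the arm maximizing estimated reward plus an uncertainty bonus.

    Each arm's reward probability carries a Beta(alpha, beta) posterior;
    the bonus is the posterior standard deviation scaled by bonus_weight.
    """
    scores = []
    for a, b in zip(alphas, betas):
        mean = a / (a + b)
        var = a * b / ((a + b) ** 2 * (a + b + 1))
        scores.append(mean + bonus_weight * math.sqrt(var))
    return max(range(len(scores)), key=scores.__getitem__)

def update(alphas, betas, arm, reward, gamma=1.0):
    """Bayesian count update after observing a 0/1 reward on `arm`.

    gamma = 1 keeps all past evidence (stationary belief); gamma < 1
    decays old counts toward the Beta(1, 1) prior, a crude stand-in
    for believing the reward contingencies can change over time.
    """
    alphas = [1 + gamma * (a - 1) for a in alphas]
    betas = [1 + gamma * (b - 1) for b in betas]
    alphas[arm] += reward
    betas[arm] += 1 - reward
    return alphas, betas
```

With a large `bonus_weight`, a less-sampled (more uncertain) arm can outscore an arm with a higher reward estimate, which is the exploration-driving effect the paper quantifies; with decay, posteriors stay wider, so the fitted bonus interacts with the assumed learning model.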
Item type: | Article |
---|---|
Published: | 2021 |
Author(s): | Guo, Dalin ; Yu, Angela J. |
Type of entry: | Bibliography |
Title: | Revisiting the Role of Uncertainty-Driven Exploration in a (Perceived) Non-Stationary World |
Language: | English |
Year of publication: | 2021 |
Publisher: | eScholarship Publishing |
Journal or publication title: | Proceedings of the Annual Meeting of the Cognitive Science Society |
Volume: | 43 |
URL / URN: | https://escholarship.org/uc/item/8xd759xp |
Division(s): | 03 Department of Human Sciences > Institute of Psychology |
Date deposited: | 27 Oct 2023 12:08 |
Last modified: | 30 Oct 2023 06:35 |
PPN: | 512752311 |