
Devaluation of Unchosen Options: A Bayesian Account of the Provenance and Maintenance of Overly Optimistic Expectations

Zhou, Corey Yishan ; Guo, Dalin ; Yu, Angela J. (2020)
Devaluation of Unchosen Options: A Bayesian Account of the Provenance and Maintenance of Overly Optimistic Expectations.
42nd Annual Meeting of the Cognitive Science Society (CogSci 2020). Online (originally Toronto, Canada) (29 July - 1 August 2020)
Conference publication, Bibliography

Abstract

Humans frequently overestimate the likelihood of desirable events while underestimating the likelihood of undesirable ones: a phenomenon known as unrealistic optimism. Previously, it was suggested that unrealistic optimism arises from asymmetric belief updating, with a relatively reduced coding of undesirable information. Prior studies have shown that a reinforcement learning (RL) model with asymmetric learning rates (greater for a positive prediction error than for a negative prediction error) can account for unrealistic optimism in a bandit task, in particular the tendency of human subjects to persistently choose a single option when there are multiple equally good options. Here, we propose an alternative explanation of such persistent behavior by modeling human behavior with a Bayesian hidden Markov model, the Dynamic Belief Model (DBM). We find that DBM captures human choice behavior better than the previously proposed asymmetric RL model. Whereas asymmetric RL attains a measure of optimism by assigning higher learning weights to better-than-expected outcomes than to worse-than-expected ones, DBM does so by progressively devaluing the unchosen options, thus placing a greater emphasis on choice history independent of reward outcome (e.g., an oft-chosen option may continue to be preferred even if it has not been particularly rewarding), a mechanism that has broadly been shown to underlie sequential effects in a variety of behavioral settings. Moreover, previous work showed that the devaluation of unchosen options in DBM helps to compensate for a default assumption of environmental non-stationarity, allowing the decision-maker both to adapt in changing environments and to obtain near-optimal performance in stationary ones. Thus, the current work suggests both a novel rationale and a novel mechanism for persistent behavior in bandit tasks.
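The two mechanisms contrasted in the abstract can be sketched in a few lines. This is an illustrative simplification, not the authors' fitted models: the full DBM maintains a Bayesian posterior over a hidden-Markov reward rate, whereas the sketch below only tracks a point estimate that decays toward the prior; all parameter values (`lr_pos`, `lr_neg`, `gamma`, `prior`, `lr`) are hypothetical.

```python
import numpy as np

def asymmetric_rl_update(q, reward, lr_pos=0.4, lr_neg=0.1):
    """Asymmetric RL value update: a larger learning rate is applied to
    positive prediction errors than to negative ones (lr_pos > lr_neg),
    so better-than-expected outcomes carry more learning weight.
    Learning rates are illustrative, not fitted values."""
    delta = reward - q                    # prediction error
    lr = lr_pos if delta > 0 else lr_neg
    return q + lr * delta

def dbm_like_update(p, chosen, reward, gamma=0.8, prior=0.5, lr=0.3):
    """Mean-tracking caricature of the DBM dynamic: every option's
    estimated reward rate first decays toward the prior (reflecting the
    model's default assumption of environmental non-stationarity), and
    only the chosen option is then updated toward the observed outcome.
    Unchosen options are thus progressively devalued toward the prior
    regardless of reward, as the abstract describes."""
    p = gamma * p + (1.0 - gamma) * prior     # all options drift to prior
    p[chosen] += lr * (reward - p[chosen])    # evidence only for chosen arm
    return p
```

For example, if two arms are both believed to have reward rate 0.7 (above the prior of 0.5), a single DBM-style trial on arm 0 leaves arm 1's estimate below 0.7 even though nothing new was observed about it, so repeated choice alone can keep the chosen option on top.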

Type of entry: Conference publication
Published: 2020
Author(s): Zhou, Corey Yishan ; Guo, Dalin ; Yu, Angela J.
Entry type: Bibliography
Title: Devaluation of Unchosen Options: A Bayesian Account of the Provenance and Maintenance of Overly Optimistic Expectations
Language: English
Year of publication: 2020
Place of publication: Red Hook, NY
Publisher: Curran Associates, Inc.
Journal, newspaper, or series title: 42nd Annual Meeting of the Cognitive Science Society (CogSci 2020)
Volume: 42
Book title: Proceedings of the Annual Meeting of the Cognitive Science Society
Event title: 42nd Annual Meeting of the Cognitive Science Society (CogSci 2020)
Event location: Online (originally Toronto, Canada)
Event date: 29 July - 1 August 2020
URL / URN: https://cognitivesciencesociety.org/wp-content/uploads/2022/...

Department(s)/field(s): 03 Department of Human Sciences
03 Department of Human Sciences > Institute of Psychology
Date deposited: 06 Nov 2023 13:53
Last modified: 07 Nov 2023 12:04
PPN: 512974039