Devaluation of Unchosen Options: A Bayesian Account of the Provenance and Maintenance of Overly Optimistic Expectations

Zhou, Corey Yishan ; Guo, Dalin ; Yu, Angela J. (2020)
Devaluation of Unchosen Options: A Bayesian Account of the Provenance and Maintenance of Overly Optimistic Expectations.
42nd Annual Meeting of the Cognitive Science Society (CogSci 2020). Online (originally Toronto, Canada) (29.07.2020-01.08.2020)
Conference publication, Bibliography

Abstract

Humans frequently overestimate the likelihood of desirable events while underestimating the likelihood of undesirable ones: a phenomenon known as unrealistic optimism. Previously, it was suggested that unrealistic optimism arises from asymmetric belief updating, with a relatively reduced coding of undesirable information. Prior studies have shown that a reinforcement learning (RL) model with asymmetric learning rates (greater for a positive prediction error than a negative prediction error) could account for unrealistic optimism in a bandit task, in particular the tendency of human subjects to persistently choose a single option when there are multiple equally good options. Here, we propose an alternative explanation of such persistent behavior, by modeling human behavior using a Bayesian hidden Markov model, the Dynamic Belief Model (DBM). We find that DBM captures human choice behavior better than the previously proposed asymmetric RL model. Whereas asymmetric RL attains a measure of optimism by giving better-than-expected outcomes higher learning weights compared to worse-than-expected outcomes, DBM does so by progressively devaluing the unchosen options, thus placing a greater emphasis on choice history independent of reward outcome (e.g. an oft-chosen option might continue to be preferred even if it has not been particularly rewarding), which has broadly been shown to underlie sequential effects in a variety of behavioral settings. Moreover, previous work showed that the devaluation of unchosen options in DBM helps to compensate for a default assumption of environmental non-stationarity, thus allowing the decision-maker to both be more adaptive in changing environments and still obtain near-optimal performance in stationary environments. Thus, the current work suggests both a novel rationale and mechanism for persistent behavior in bandit tasks.
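
The contrast drawn in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the learning rates (0.4 and 0.1), the stability parameter gamma = 0.8, the Beta(2, 2) prior, and the grid approximation are all assumptions chosen for illustration. The DBM-style update follows the standard change-point formulation, in which each arm's reward rate is assumed to be redrawn from a generic prior with probability 1 - gamma on every trial, so the belief about an unchosen arm relaxes toward the prior.

```python
import numpy as np

# Grid over possible reward probabilities (an illustrative discretization).
THETA = np.linspace(0.005, 0.995, 100)


def asymmetric_rl_update(q, reward, alpha_pos=0.4, alpha_neg=0.1):
    """Update the chosen option's value; positive errors get the larger rate."""
    delta = reward - q  # prediction error
    alpha = alpha_pos if delta > 0 else alpha_neg
    return q + alpha * delta


def make_prior(a=2.0, b=2.0):
    """Discretized Beta(a, b) prior over the reward rate (assumed shape)."""
    p = THETA ** (a - 1) * (1 - THETA) ** (b - 1)
    return p / p.sum()


def dbm_update(belief, prior, gamma=0.8, reward=None):
    """One DBM-style trial for one arm; reward=None means the arm was unchosen."""
    # Assumed non-stationarity: with probability 1 - gamma the reward rate
    # is redrawn from the prior, so the belief is mixed with the prior.
    predictive = gamma * belief + (1 - gamma) * prior
    if reward is None:
        # No observation: the belief simply drifts toward the prior.
        return predictive
    # Bernoulli likelihood of the observed outcome at each grid point.
    likelihood = THETA if reward == 1 else 1 - THETA
    posterior = likelihood * predictive
    return posterior / posterior.sum()


def belief_mean(belief):
    """Expected reward rate under a discretized belief."""
    return float(belief @ THETA)


if __name__ == "__main__":
    prior = make_prior()
    belief = prior.copy()
    for _ in range(5):  # arm is chosen and rewarded five times
        belief = dbm_update(belief, prior, reward=1)
    print("after rewards:", belief_mean(belief))
    for _ in range(5):  # arm is then left unchosen for five trials
        belief = dbm_update(belief, prior)
    print("after being unchosen:", belief_mean(belief))
```

In this sketch the asymmetric RL value of an unchosen option stays fixed, because only the chosen option is updated. The DBM-style estimate of an unchosen option moves toward the prior mean on every trial, which lowers it whenever the current estimate lies above that mean.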

Item type: Conference publication
Published: 2020
Author(s): Zhou, Corey Yishan ; Guo, Dalin ; Yu, Angela J.
Entry category: Bibliography
Title: Devaluation of Unchosen Options: A Bayesian Account of the Provenance and Maintenance of Overly Optimistic Expectations
Language: English
Year of publication: 2020
Place: Red Hook, NY
Publisher: Curran Associates, Inc.
Title of journal, newspaper, or series: 42nd Annual Meeting of the Cognitive Science Society (CogSci 2020)
Journal volume: 42
Book title: Proceedings of the Annual Meeting of the Cognitive Science Society
Event title: 42nd Annual Meeting of the Cognitive Science Society (CogSci 2020)
Event location: Online (originally Toronto, Canada)
Event dates: 29.07.2020-01.08.2020
URL / URN: https://cognitivesciencesociety.org/wp-content/uploads/2022/...
Department(s)/field(s): 03 Department of Human Sciences
03 Department of Human Sciences > Institute of Psychology
Date deposited: 06 Nov 2023 13:53
Last modified: 07 Nov 2023 12:04
PPN: 512974039