Zhang, Shunan ; Yu, Angela J (2013)
Forgetful Bayes and myopic planning: Human learning and decision-making in a bandit setting.
Twenty-seventh Conference on Neural Information Processing Systems (NIPS 2013). Lake Tahoe, Nevada (05.12.2013-10.12.2013)
Conference publication, Bibliography
Abstract
How humans achieve long-term goals in an uncertain environment, via repeated trials and noisy observations, is an important problem in cognitive science. We investigate this behavior in the context of a multi-armed bandit task. We compare human behavior to a variety of models that vary in their representational and computational complexity. Our result shows that subjects' choices, on a trial-to-trial basis, are best captured by a "forgetful" Bayesian iterative learning model in combination with a partially myopic decision policy known as Knowledge Gradient. This model accounts for subjects' trial-by-trial choice better than a number of other previously proposed models, including optimal Bayesian learning and risk minimization, epsilon-greedy and win-stay-lose-shift. It has the added benefit of being closer in performance to the optimal Bayesian model than all the other heuristic models that have the same computational complexity (all are significantly less complex than the optimal model). These results constitute an advancement in the theoretical understanding of how humans negotiate the tension between exploration and exploitation in a noisy, imperfectly known environment.
Item type: | Conference publication |
---|---|
Published: | 2013 |
Author(s): | Zhang, Shunan ; Yu, Angela J |
Entry type: | Bibliography |
Title: | Forgetful Bayes and myopic planning: Human learning and decision-making in a bandit setting |
Language: | English |
Year of publication: | 2013 |
Place: | Red Hook, NY |
Publisher: | Curran Associates, Inc. |
Book title: | Advances in Neural Information Processing Systems 26 (NIPS 2013) |
Series volume: | 26 |
Event title: | Twenty-seventh Conference on Neural Information Processing Systems (NIPS 2013) |
Event location: | Lake Tahoe, Nevada |
Event date: | 05.12.2013-10.12.2013 |
URL / URN: | https://proceedings.neurips.cc/paper_files/paper/2013/hash/6... |
Division(s): | 03 Department of Human Sciences; 03 Department of Human Sciences > Institute of Psychology |
Date deposited: | 31 Oct 2023 06:59 |
Last modified: | 01 Nov 2023 07:20 |
PPN: | 512781532 |