TU Darmstadt / ULB / TUbiblio

Altered Statistical Learning and Decision-Making in Methamphetamine Dependence: Evidence from a Two-Armed Bandit Task

Harlé, Katia M. ; Zhang, Shunan ; Schiff, Max ; Mackey, Scott ; Paulus, Martin P. ; Yu, Angela J. (2015)
Altered Statistical Learning and Decision-Making in Methamphetamine Dependence: Evidence from a Two-Armed Bandit Task.
In: Frontiers in Psychology, 6
doi: 10.3389/fpsyg.2015.01910
Artikel, Bibliographie

Kurzbeschreibung (Abstract)

Understanding how humans weigh long-term and short-term goals is important for both basic cognitive science and clinical neuroscience, as substance users need to balance the appeal of an immediate high vs. the long-term goal of sobriety. We use a computational model to identify learning and decision-making abnormalities in methamphetamine-dependent individuals (MDI, n = 16) vs. healthy control subjects (HCS, n = 16), in a two-armed bandit task. In this task, subjects repeatedly choose between two arms with fixed but unknown reward rates. Each choice not only yields potential immediate reward but also information useful for long-term reward accumulation, thus pitting exploration against exploitation. We formalize the task as comprising a learning component, the updating of estimated reward rates based on ongoing observations, and a decision-making component, the choice among options based on current beliefs and uncertainties about reward rates. We model the learning component as iterative Bayesian inference (the Dynamic Belief Model), and the decision component using five competing decision policies: Win-stay/Lose-shift (WSLS), ε-Greedy, τ-Switch, Softmax, Knowledge Gradient. HCS and MDI significantly differ in how they learn about reward rates and use them to make decisions. HCS learn from past observations but weigh recent data more, and their decision policy is best fit as Softmax. MDI are more likely to follow the simple learning-independent policy of WSLS, and among MDI best fit by Softmax, they have more pessimistic prior beliefs about reward rates and are less likely to choose the option estimated to be most rewarding. Neurally, MDI's tendency to avoid the most rewarding option is associated with a lower gray matter volume of the thalamic dorsal lateral nucleus. More broadly, our work illustrates the ability of our computational framework to help reveal subtle learning and decision-making abnormalities in substance use.

Typ des Eintrags: Artikel
Erschienen: 2015
Autor(en): Harlé, Katia M. ; Zhang, Shunan ; Schiff, Max ; Mackey, Scott ; Paulus, Martin P. ; Yu, Angela J.
Art des Eintrags: Bibliographie
Titel: Altered Statistical Learning and Decision-Making in Methamphetamine Dependence: Evidence from a Two-Armed Bandit Task
Sprache: Englisch
Publikationsjahr: 2015
Ort: Lausanne
Verlag: Frontiers Research Foundation
Titel der Zeitschrift, Zeitung oder Schriftenreihe: Frontiers in Psychology
Jahrgang/Volume einer Zeitschrift: 6
DOI: 10.3389/fpsyg.2015.01910
URL / URN: https://www.frontiersin.org/articles/10.3389/fpsyg.2015.0191...
Kurzbeschreibung (Abstract):

Understanding how humans weigh long-term and short-term goals is important for both basic cognitive science and clinical neuroscience, as substance users need to balance the appeal of an immediate high vs. the long-term goal of sobriety. We use a computational model to identify learning and decision-making abnormalities in methamphetamine-dependent individuals (MDI, n = 16) vs. healthy control subjects (HCS, n = 16), in a two-armed bandit task. In this task, subjects repeatedly choose between two arms with fixed but unknown reward rates. Each choice not only yields potential immediate reward but also information useful for long-term reward accumulation, thus pitting exploration against exploitation. We formalize the task as comprising a learning component, the updating of estimated reward rates based on ongoing observations, and a decision-making component, the choice among options based on current beliefs and uncertainties about reward rates. We model the learning component as iterative Bayesian inference (the Dynamic Belief Model), and the decision component using five competing decision policies: Win-stay/Lose-shift (WSLS), ε-Greedy, τ-Switch, Softmax, Knowledge Gradient. HCS and MDI significantly differ in how they learn about reward rates and use them to make decisions. HCS learn from past observations but weigh recent data more, and their decision policy is best fit as Softmax. MDI are more likely to follow the simple learning-independent policy of WSLS, and among MDI best fit by Softmax, they have more pessimistic prior beliefs about reward rates and are less likely to choose the option estimated to be most rewarding. Neurally, MDI's tendency to avoid the most rewarding option is associated with a lower gray matter volume of the thalamic dorsal lateral nucleus. More broadly, our work illustrates the ability of our computational framework to help reveal subtle learning and decision-making abnormalities in substance use.

Zusätzliche Informationen:

Article 1910

Fachbereich(e)/-gebiet(e): 03 Fachbereich Humanwissenschaften
03 Fachbereich Humanwissenschaften > Institut für Psychologie
Hinterlegungsdatum: 02 Nov 2023 07:48
Letzte Änderung: 03 Nov 2023 13:55
PPN: 512847363
Export:
Suche nach Titel in: TUfind oder in Google
Frage zum Eintrag Frage zum Eintrag

Optionen (nur für Redakteure)
Redaktionelle Details anzeigen Redaktionelle Details anzeigen