Ramakrishnan Srinivasan A, Shaik Riaz B, Kanagamani Tamizharasan, Neppala Gopi, Chen Jeffrey, Fiore Vincenzo G, Hammond Christopher J, Srinivasan Shankar, Ivanov Iliyan, Chakravarthy V Srinivasa, Kool Wouter, Parvaz Muhammad A
Department of Health Informatics, Rutgers - School of Health Professions, Piscataway, NJ USA.
Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY USA.
NPP Digit Psychiatry Neurosci. 2025;3(1):1. doi: 10.1038/s44277-024-00023-8. Epub 2025 Jan 3.
Reinforcement learning studies propose that decision-making is guided by a tradeoff between computationally cheap model-free (habitual) control and computationally costly model-based (goal-directed) control. Model-based control is typically engaged more under highly rewarding conditions to minimize risk and maximize gain. Although prior studies have shown impaired sensitivity to reward value in individuals with frequent alcohol use, it is unclear how these individuals arbitrate between model-free and model-based control as a function of the magnitude of reward incentives. In this study, 81 individuals (47 frequent Alcohol Users and 34 Alcohol Non-Users) performed a modified two-step learning task in which stakes were sometimes high and sometimes low. Maximum fitting of a dual-system reinforcement-learning model was used to assess the degree of model-based control, and a utility model was used to assess risk sensitivity for the low- and high-stakes trials separately. As expected, Alcohol Non-Users showed significantly higher model-based control in the higher-reward than in the lower-reward condition, whereas no such difference between conditions was observed for Alcohol Users. Additionally, both groups were significantly less risk-averse in the higher-reward than in the lower-reward condition. However, Alcohol Users were significantly less risk-averse than Alcohol Non-Users in the higher-reward condition. Lastly, greater model-based control was associated with a less risk-sensitive approach in Alcohol Users. Taken together, these results suggest that frequent Alcohol Users may have impaired metacontrol, making them less flexible in adapting to varying monetary rewards and more prone to risky decision-making, especially when the stakes are high.
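As a rough illustration of what "degree of model-based control" means in a dual-system model, the sketch below mixes model-based and model-free action values with a weight w and converts the result into choice probabilities with a softmax. This is a generic hybrid formulation under stated assumptions, not the authors' fitted model: the function names, the inverse temperature `beta`, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def hybrid_values(q_mb, q_mf, w):
    """Mix model-based and model-free action values.

    w = 1.0 means fully model-based, w = 0.0 fully model-free.
    (Assumed generic form: Q = w * Q_MB + (1 - w) * Q_MF.)
    """
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return [w * mb + (1.0 - w) * mf for mb, mf in zip(q_mb, q_mf)]


def softmax_choice_probs(q, beta):
    """Softmax over action values with inverse temperature beta."""
    m = max(q)  # subtract the max for numerical stability
    exps = [math.exp(beta * (v - m)) for v in q]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical values for two first-stage actions: the two systems disagree.
q_mb = [0.8, 0.2]  # model-based system prefers action 0
q_mf = [0.4, 0.6]  # model-free system prefers action 1

for w in (0.0, 0.5, 1.0):
    probs = softmax_choice_probs(hybrid_values(q_mb, q_mf, w), beta=5.0)
    print(f"w={w}: P(action 0)={probs[0]:.3f}")
```

In a sketch like this, a stakes contrast of the kind described in the abstract would correspond to estimating one weight for low-stakes trials and another for high-stakes trials, then comparing the two within each group.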