Integrated Systems Biology Laboratory, Department of Systems Science, Graduate School of Informatics, Kyoto University, Sakyo-ku, Kyoto, Japan.
Laboratory of Structural Physiology, Center for Disease Biology and Integrative Medicine, Faculty of Medicine, University of Tokyo, Bunkyo-ku, Tokyo, Japan.
PLoS Comput Biol. 2020 Jul 23;16(7):e1008078. doi: 10.1371/journal.pcbi.1008078. eCollection 2020 Jul.
Animals remember temporal links between their actions and subsequent rewards. We previously discovered a synaptic mechanism underlying such reward learning in D1 receptor (D1R)-expressing spiny projection neurons (D1 SPNs) of the striatum. Dopamine (DA) bursts promote dendritic spine enlargement only within a time window of a few seconds after paired pre- and post-synaptic spiking (pre-post pairing), a phenomenon termed reinforcement plasticity (RP). The previous study also identified the underlying signaling pathways; however, it remains unclear how the signaling dynamics give rise to RP. In the present study, we first developed a computational model of signaling dynamics in D1 SPNs. The D1 RP model successfully reproduced experimentally observed protein kinase A (PKA) activity, including its critical time window. In this model, adenylate cyclase type 1 (AC1) in the spines/thin dendrites played a pivotal role as a coincidence detector for pre-post pairing and the DA burst. In particular, pre-post pairing (Ca2+ signal) stimulated AC1 with a delay, and the Ca2+-stimulated AC1 was then activated by the DA burst, producing the asymmetric time window. Moreover, the small size of the spines/thin dendrites is crucial for the short time window of PKA activity. We then developed an RP model for D2 SPNs, which also predicted a critical time window for RP that depended on the timing of pre-post pairing and the phasic DA dip. AC1 served as the coincidence detector in the D2 RP model as well. We further simulated the signaling pathway leading to Ca2+/calmodulin-dependent protein kinase II (CaMKII) activation and clarified the role of the molecules downstream of AC1 as integrators that turn transient input signals into persistent spine enlargement. Finally, we discuss how such timing windows guide animals' reward learning.
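The coincidence-detection idea in the abstract (a delayed, Ca2+-dependent priming of AC1 that a later DA burst can act on) can be illustrated with a toy calculation. The sketch below is not the authors' signaling model, and the abstract gives no equations or parameter values: the function names (`primed_ac1`, `da_drive`, `camp_yield`), the exponential waveforms, and all time constants are assumptions chosen only to show how a delayed priming signal multiplied by a brief DA-driven signal can yield an asymmetric timing window.

```python
import math


def primed_ac1(t, tau_rise=0.5, tau_decay=2.0):
    """Toy level of Ca2+-primed AC1 at time t (s); pairing occurs at t = 0.

    Assumed shape: difference of exponentials, so priming starts at zero,
    rises with a delay, then decays. Time constants are arbitrary.
    """
    if t <= 0.0:
        return 0.0
    return math.exp(-t / tau_decay) - math.exp(-t / tau_rise)


def da_drive(t, t_da, tau_da=0.5):
    """Toy DA-dependent drive: zero before the burst at t_da, then decaying."""
    if t < t_da:
        return 0.0
    return math.exp(-(t - t_da) / tau_da)


def camp_yield(t_da, dt=0.01, t_end=20.0):
    """Integrate the product of priming and DA drive (a stand-in for cAMP).

    The product is nonzero only when both signals overlap in time, which is
    the coincidence-detection step in this toy picture.
    """
    total = 0.0
    steps = int(t_end / dt)
    for i in range(steps):
        t = i * dt  # priming is zero for t <= 0, so start at the pairing time
        total += primed_ac1(t) * da_drive(t, t_da) * dt
    return total


if __name__ == "__main__":
    # Negative delays mean the DA burst precedes the pre-post pairing.
    for delay in (-3.0, -1.0, 0.0, 0.6, 1.0, 3.0, 5.0):
        print(f"DA delay {delay:+.1f} s -> toy yield {camp_yield(delay):.4f}")
```

With these assumed waveforms, a DA burst that arrives shortly after pairing overlaps the primed state and gives a larger output than a burst of the same size arriving before pairing, which is the qualitative asymmetry the abstract describes. The quantitative window in the published model depends on its actual reaction kinetics and compartment sizes, which this sketch does not represent.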