
A Reconfigurable Two-WSe₂-Transistor Synaptic Cell for Reinforcement Learning.

Authors

Zhou Yue, Wang Yasai, Zhuge Fuwei, Guo Jianmiao, Ma Sijie, Wang Jingli, Tang Zijian, Li Yi, Miao Xiangshui, He Yuhui, Chai Yang

Affiliations

Wuhan National Laboratory for Optoelectronics, School of Integrated Circuits, Huazhong University of Science and Technology, Wuhan, 430000, China.

Department of Applied Physics, The Hong Kong Polytechnic University, Hong Kong, 999077, China.

Publication details

Adv Mater. 2022 Dec;34(48):e2107754. doi: 10.1002/adma.202107754. Epub 2022 Feb 25.

DOI:10.1002/adma.202107754
PMID:35104378
Abstract

Reward-modulated spike-timing-dependent plasticity (R-STDP) is a brain-inspired reinforcement learning (RL) rule, exhibiting potential for decision-making tasks and artificial general intelligence. However, the hardware implementation of the reward-modulation process in R-STDP usually requires complicated Si complementary metal-oxide-semiconductor (CMOS) circuit design that causes high power consumption and large footprint. Here, a design with two synaptic transistors (2T) connected in a parallel structure is experimentally demonstrated. The 2T unit based on WSe₂ ferroelectric transistors exhibits reconfigurable polarity behavior, where one channel can be tuned as n-type and the other as p-type due to nonvolatile ferroelectric polarization. In this way, opposite synaptic weight update behaviors with multilevel (>6 bit) conductance states, ultralow nonlinearity (0.56/-1.23), and large G_max/G_min ratio of 30 are realized. By applying positive/negative reward to (anti-)STDP component of 2T cell, R-STDP learning rules are realized for training the spiking neural network and demonstrated to solve the classical cart-pole problem, exhibiting a way for realizing low-power (32 pJ per forward process) and highly area-efficient (100 µm²) hardware chip for reinforcement learning.
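The reward routing described in the abstract can be illustrated with a small software sketch. This is not the authors' implementation or device model. The exponential STDP window, the constants `A_PLUS`, `A_MINUS`, and `TAU_MS`, the normalized conductance range, and all function names are assumptions chosen for illustration. Only the G_max/G_min ratio of 30 and the 6-bit level count (a lower bound, since the abstract reports >6 bit) are taken from the abstract.

```python
import math

# Hypothetical STDP window parameters (not from the paper).
A_PLUS = 0.010    # potentiation amplitude
A_MINUS = 0.012   # depression amplitude
TAU_MS = 20.0     # time constant of the window, in milliseconds

# Normalized conductance range: only the ratio of 30 comes from the abstract.
G_MIN = 1.0
G_MAX = 30.0
LEVELS = 64       # 6-bit resolution, a lower bound on the reported ">6 bit"


def stdp(dt_ms):
    """Classical STDP window, with dt_ms = t_post - t_pre."""
    if dt_ms >= 0:
        return A_PLUS * math.exp(-dt_ms / TAU_MS)
    return -A_MINUS * math.exp(dt_ms / TAU_MS)


def anti_stdp(dt_ms):
    """Opposite-polarity window, standing in for the second channel."""
    return -stdp(dt_ms)


def r_stdp_update(dt_ms, reward):
    """Route a positive reward to the STDP branch and a negative reward
    to the anti-STDP branch, scaling by the reward magnitude."""
    if reward > 0:
        return reward * stdp(dt_ms)
    if reward < 0:
        return -reward * anti_stdp(dt_ms)
    return 0.0


def quantize(g):
    """Clip a conductance to [G_MIN, G_MAX] and snap it to one of LEVELS
    evenly spaced states."""
    g = min(max(g, G_MIN), G_MAX)
    step = (G_MAX - G_MIN) / (LEVELS - 1)
    return G_MIN + round((g - G_MIN) / step) * step


if __name__ == "__main__":
    g = 15.0
    # Pre-before-post pairing (dt = +5 ms) followed by a negative reward.
    g = quantize(g + 100.0 * r_stdp_update(5.0, -1.0))
    print(g)
```

In this sketch, routing by reward sign is algebraically the same as the product `reward * stdp(dt_ms)`, the usual software form of R-STDP. The point of the two-branch form is to show how a pair of opposite-polarity update paths can supply the sign without a separate multiplier. The evenly spaced quantization ignores the nonlinearity values reported in the abstract.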


