Ou Shaoyuan, Xue Kaiwen, Zhou Lian, Lee Chun-Ho, Sludds Alexander, Hamerly Ryan, Zhang Ke, Feng Hanke, Yu Yue, Kopparapu Reshma, Zhong Eric, Wang Cheng, Englund Dirk, Yu Mengjie, Chen Zaijun
Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California, Los Angeles, CA 90089, USA.
Department of Electrical Engineering and Computer Science, University of California, Berkeley, CA 94720, USA.
Sci Adv. 2025 Jun 6;11(23):eadu0228. doi: 10.1126/sciadv.adu0228. Epub 2025 Jun 4.
The escalating data volume and complexity resulting from the rapid expansion of artificial intelligence (AI), the Internet of Things (IoT), and 5G/6G mobile networks are creating an urgent need for energy-efficient, scalable computing hardware. Here, we demonstrate a hypermultiplexed tensor optical processor that can perform trillions of operations per second using space-time-wavelength three-dimensional optical parallelism, enabling O(N^2) operations per clock cycle with O(N) modulator devices. The system is built with wafer-fabricated III/V micrometer-scale lasers and high-speed thin-film lithium niobate electro-optics for encoding at tens of femtojoules per symbol. The lasing threshold provides an analog inline rectifier (ReLU) nonlinearity for low-latency activation. System scalability is verified with machine learning models of 405,000 parameters. The combination of high clock rates, energy-efficient processing, and programmability unlocks the potential of light for low-energy AI accelerators, with applications ranging from training large AI models to real-time decision-making in edge deployment.
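The scaling argument and the threshold-based activation can be illustrated with a toy numerical model. The sketch below is not the authors' implementation: it assumes an idealized laser response (zero output below threshold, linear above it) and a hypothetical layout in which an n x n layer uses n input encoders plus n weight encoders. The names `relu_threshold`, `toy_layer`, and `resource_count` are illustrative only.

```python
import numpy as np

def relu_threshold(drive, threshold=0.0):
    """Idealized laser-like response (assumption): zero below threshold, linear above."""
    return np.maximum(drive - threshold, 0.0)

def toy_layer(weights, x, threshold=0.0):
    """One layer: matrix-vector product followed by the thresholded response."""
    return relu_threshold(weights @ x, threshold)

def resource_count(n):
    """Hypothetical count for an n x n layer: n input plus n weight encoders."""
    return {"modulators": 2 * n, "macs": n * n}

rng = np.random.default_rng(0)
n = 8
W = rng.normal(size=(n, n))   # stand-in weight matrix
x = rng.normal(size=n)        # stand-in input vector
y = toy_layer(W, x)           # non-negative activations, shape (n,)
counts = resource_count(n)    # encoder count grows as n, multiply-accumulates as n^2
```

Under these assumptions, doubling n doubles the encoder count but quadruples the multiply-accumulate operations per layer, which is the kind of gap the abstract's scaling statement describes.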