Huang Haoxin, Shi Shuhui, Zha Jiajia, Xia Yunpeng, Wang Huide, Yang Peng, Zheng Long, Xu Songcen, Wang Wei, Ren Yi, Wang Yongji, Chen Ye, Chan Hau Ping, Ho Johnny C, Chai Yang, Wang Zhongrui, Tan Chaoliang
Department of Electrical Engineering, City University of Hong Kong, Hong Kong SAR, China.
Department of Electrical and Electronic Engineering, University of Hong Kong, Hong Kong SAR, China.
Nat Commun. 2025 Apr 24;16(1):3836. doi: 10.1038/s41467-025-59104-7.
Efficiently capturing multidimensional signals containing spectral and temporal information is crucial for intelligent machine vision. Although in-sensor computing shows promise for efficient visual processing by reducing data transfer, its capability to compress temporal/spectral data is rarely reported. Here we demonstrate a programmable two-dimensional (2D) heterostructure-based optoelectronic sensor integrating sensing, memory, and computation for in-sensor data compression. Our 2D sensor captured and memorized/encoded optical signals, leading to in-device snapshot compression of dynamic videos and three-dimensional spectral data with a compression ratio of 8:1. The reconstruction quality, indicated by a peak signal-to-noise ratio value of 15.81 dB, is comparable to the 16.21 dB achieved through software. Meanwhile, the compressed action videos (in the form of 2D images) preserve all semantic information and can be accurately classified using in-sensor convolution without decompression, achieving accuracy on par with uncompressed videos (93.18% vs 83.43%). Our 2D optoelectronic sensors promote the development of efficient intelligent vision systems at the edge.
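The snapshot compression and the PSNR figure quoted above can be illustrated with a minimal sketch. It assumes the conventional snapshot compressive imaging forward model, in which each frame is weighted by its own mask and the weighted frames are summed into one 2D measurement. The abstract does not describe the encoding used in the device, so the binary random masks, the 4×4 frame size, and the function names `snapshot_compress` and `psnr` are all hypothetical and chosen only for illustration.

```python
import math
import random


def snapshot_compress(frames, masks):
    """Sum mask-weighted frames pixel by pixel into one 2D measurement."""
    height, width = len(frames[0]), len(frames[0][0])
    measurement = [[0.0] * width for _ in range(height)]
    for frame, mask in zip(frames, masks):
        for r in range(height):
            for c in range(width):
                measurement[r][c] += mask[r][c] * frame[r][c]
    return measurement


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2D images."""
    squared_errors = [
        (a - b) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for a, b in zip(ref_row, est_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical toy data: 8 random frames of 4x4 pixels with values in [0, 1).
rng = random.Random(0)
T, H, W = 8, 4, 4  # 8 frames folded into 1 snapshot, matching the 8:1 ratio
frames = [[[rng.random() for _ in range(W)] for _ in range(H)] for _ in range(T)]
# Assumed binary random masks; the device's actual encoding is not given.
masks = [[[rng.randint(0, 1) for _ in range(W)] for _ in range(H)] for _ in range(T)]

snapshot = snapshot_compress(frames, masks)
ratio = (T * H * W) / (H * W)  # values captured per value stored
```

Here `ratio` is the number of captured pixel values divided by the number of stored ones. Recovering the frames from `snapshot` requires a separate reconstruction algorithm, which this sketch does not attempt; `psnr` only scores a reconstruction against its reference once one is available.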