A multichannel optical computing architecture for advanced machine vision.

Authors

Xu Zhihao, Yuan Xiaoyun, Zhou Tiankuang, Fang Lu

Affiliations

Sigma Laboratory, Department of Electronic Engineering, Tsinghua University, Beijing, China.

Beijing National Research Center for Information Science and Technology (BNRist), Beijing, China.

Publication information

Light Sci Appl. 2022 Aug 18;11(1):255. doi: 10.1038/s41377-022-00945-y.

Abstract

Endowed with superior computing speed and energy efficiency, optical neural networks (ONNs) have attracted ever-growing attention in recent years. Existing optical computing architectures are mainly single-channel because they lack advanced optical connection and interaction operators, and so they solve only simple tasks such as handwritten digit classification and saliency detection. The limited computing capacity and scalability of single-channel ONNs restrict the optical implementation of advanced machine vision. Herein, we develop Monet: a multichannel optical neural network architecture for universal multiple-input multiple-channel optical computing, based on a novel projection-interference-prediction framework in which the inter- and intra-channel connections are mapped to optical interference and diffraction. In Monet, optical interference patterns are generated by projecting and interfering the multichannel inputs in a shared domain. These patterns, which encode the correspondences together with feature embeddings, are iteratively produced through the projection-interference process to predict the final output optically. For the first time, Monet validates that multichannel processing properties can be implemented optically with high efficiency, enabling real-world intelligent multichannel-processing tasks, including 3D and motion detection, to be solved via optical computing. Extensive experiments on different scenarios demonstrate the effectiveness of Monet in handling advanced machine vision tasks, with accuracy comparable to that of its electronic counterparts yet a ten-fold improvement in computing efficiency. For intelligent computing, the trend toward handling real-world advanced tasks is irreversible. By breaking the capacity and scalability limitations of single-channel ONNs and further exploring the multichannel processing potential of wave optics, we anticipate that the proposed technique will accelerate the development of more powerful optical AI as critical support for modern advanced machine vision.
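The projection and interference steps described in the abstract can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the authors' implementation: it models "projection" as a phase mask followed by free-space diffraction (angular spectrum method) and "interference" as the coherent sum of the projected fields recorded as intensity. The function names, the wavelength, the pixel pitch, and the propagation distance are all hypothetical values chosen for illustration.

```python
import numpy as np


def angular_spectrum(field, wavelength, pitch, distance):
    """Propagate a complex field over `distance` with the angular spectrum method."""
    fx = np.fft.fftfreq(field.shape[0], d=pitch)
    fy = np.fft.fftfreq(field.shape[1], d=pitch)
    fxx, fyy = np.meshgrid(fx, fy, indexing="ij")
    # Squared longitudinal spatial frequency; non-positive values are evanescent.
    arg = 1.0 / wavelength**2 - fxx**2 - fyy**2
    kz = 2.0 * np.pi * np.sqrt(np.maximum(arg, 0.0))
    transfer = np.where(arg > 0, np.exp(1j * kz * distance), 0.0)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)


def project_and_interfere(channels, phase_masks,
                          wavelength=532e-9,  # assumed illustrative value
                          pitch=8e-6,         # assumed illustrative value
                          distance=0.05):     # assumed illustrative value
    """Project each channel into a shared plane and record the interference intensity.

    channels    : list of real 2D amplitude arrays, one per input channel
    phase_masks : list of 2D phase arrays (radians), one per channel
    """
    projected = [
        angular_spectrum(c * np.exp(1j * m), wavelength, pitch, distance)
        for c, m in zip(channels, phase_masks)
    ]
    # Coherent superposition: fields add before the detector squares the modulus.
    total = np.sum(projected, axis=0)
    return np.abs(total) ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    left = rng.random((64, 64))
    right = np.roll(left, 3, axis=1)  # second view, shifted by a few pixels
    masks = [np.zeros((64, 64)), np.zeros((64, 64))]
    pattern = project_and_interfere([left, right], masks)
    print(pattern.shape, float(pattern.sum()))
```

In this toy model, two identical channels with identical masks interfere constructively everywhere, while a phase offset of pi on one mask cancels the pattern, so the recorded intensity depends on how the channels correspond. In a trainable network the phase masks would be the learned parameters; that training loop is outside the scope of this sketch.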

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fa15/9385649/372f93741d8f/41377_2022_945_Fig1_HTML.jpg
