Transferable polychromatic optical encoder for neural networks.

Author information

Choi Minho, Xiang Jinlin, Wirth-Singh Anna, Baek Seung-Hwan, Shlizerman Eli, Majumdar Arka

Affiliations

Department of Electrical and Computer Engineering, University of Washington, Seattle, 98103, WA, USA.

Department of Physics, University of Washington, Seattle, 98103, WA, USA.

Publication information

Nat Commun. 2025 Jul 1;16(1):5623. doi: 10.1038/s41467-025-61338-4.

DOI:10.1038/s41467-025-61338-4

PMID:40593885

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12215009/

Abstract

Artificial neural networks have fundamentally transformed the field of computer vision, providing unprecedented performance. However, these neural networks for image processing demand substantial computational resources, often hindering real-time operation. In this work, we demonstrate an optical encoder that can perform convolution simultaneously in three color channels during image capture, effectively implementing several initial convolutional layers of the network. Such optical encoding results in a ~24,000× reduction in computational operations, with state-of-the-art classification accuracy (~73.2%) in a free-space optical system. In addition, our analog optical encoder, trained on CIFAR-10 data, can be transferred to the ImageNet subset, High-10, without any modifications, and still exhibits moderate accuracy. The proposed method can decrease total system-level energy by more than two orders of magnitude per single object classification. Our results demonstrate the potential of a hybrid optical/digital computer vision system in which the optical frontend pre-processes an ambient scene to reduce the energy and latency of the whole computer vision system.
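As a rough illustration of what "convolution in three color channels during image capture" means numerically, the sketch below models an incoherent imaging system in which each color channel of the scene is convolved with its own point spread function (PSF). This is an assumption-laden toy model, not the authors' implementation: the function name `optical_encode`, the array shapes, the random PSFs, and the use of circular (FFT-based) convolution are all choices made here for illustration only.

```python
import numpy as np

def optical_encode(scene, psfs):
    """Toy model of a polychromatic optical encoder (illustrative assumption).

    scene: (H, W, 3) array of non-negative intensities.
    psfs:  (k, k, 3) array, one non-negative PSF per color channel.
    Returns an (H, W, 3) array: each channel circularly convolved with its PSF.
    """
    H, W, C = scene.shape
    k = psfs.shape[0]
    out = np.empty((H, W, C), dtype=float)
    for c in range(C):
        # Zero-pad the PSF to the scene size; a product of FFTs then
        # corresponds to a circular convolution in the spatial domain.
        kernel = np.zeros((H, W))
        kernel[:k, :k] = psfs[:, :, c]
        spectrum = np.fft.fft2(scene[:, :, c]) * np.fft.fft2(kernel)
        out[:, :, c] = np.real(np.fft.ifft2(spectrum))
    return out

# Hypothetical inputs: a random 32x32 RGB scene and random 7x7 PSFs.
rng = np.random.default_rng(0)
scene = rng.random((32, 32, 3))
psfs = rng.random((7, 7, 3))
psfs /= psfs.sum(axis=(0, 1), keepdims=True)  # normalize each PSF to unit sum

encoded = optical_encode(scene, psfs)
```

In a hybrid system of the kind the abstract describes, an array like `encoded` would stand in for the sensor readout that a small digital backend then classifies; the convolution itself costs no digital operations because the optics perform it.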


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8cea/12215009/e66fee81b3fa/41467_2025_61338_Fig1_HTML.jpg
