Towards high-performance deep learning architecture and hardware accelerator design for robust analysis in diffuse correlation spectroscopy.

Authors

Zang Zhenya, Wang Quan, Pan Mingliang, Zhang Yuanzhe, Chen Xi, Li Xingda, Li David Day Uei

Affiliations

Department of Biomedical Engineering, University of Strathclyde, Glasgow, United Kingdom.

Publication information

Comput Methods Programs Biomed. 2025 Jan;258:108471. doi: 10.1016/j.cmpb.2024.108471. Epub 2024 Oct 28.

DOI:10.1016/j.cmpb.2024.108471
PMID:39531806
Abstract

This study proposes a compact deep learning (DL) architecture and a highly parallelized computing hardware platform to reconstruct the blood flow index (BFi) in diffuse correlation spectroscopy (DCS). We leveraged a rigorous analytical model to generate autocorrelation functions (ACFs) to train the DL network. We assessed the accuracy of the proposed DL using simulated and milk phantom data. Compared to convolutional neural networks (CNN), our lightweight DL architecture achieves 66.7% and 18.5% improvement in MSE for BFi and the coherence factor β, using synthetic data evaluation. The accuracy of rBFi over different algorithms was also investigated. We further simplified the DL computing primitives using subtraction for feature extraction, considering further hardware implementation. We extensively explored computing parallelism and fixed-point quantization within the DL architecture. With the DL model's compact size, we employed unrolling and pipelining optimizations for computation-intensive for-loops in the DL model while storing all learned parameters in on-chip BRAMs. We also achieved pixel-wise parallelism, enabling simultaneous, real-time processing of 10 and 15 autocorrelation functions on Zynq-7000 and Zynq-UltraScale+ field programmable gate array (FPGA), respectively. Unlike existing FPGA accelerators that produce BFi and the β from autocorrelation functions on standalone hardware, our approach is an encapsulated, end-to-end on-chip conversion process from intensity photon data to the temporal intensity ACF and subsequently reconstructing BFi and β. This hardware platform achieves an on-chip solution to replace post-processing and miniaturize modern DCS systems that use single-photon cameras. We also comprehensively compared the computational efficiency of our FPGA accelerator to CPU and GPU solutions.
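As an illustration of the first stage of the pipeline described in the abstract (converting photon intensity data into a temporal intensity ACF), the following is a minimal software sketch. It is not the authors' FPGA implementation. The function name `intensity_acf`, the input format (photon counts per time bin), and the symmetric normalization are assumptions made for this example.

```python
import numpy as np


def intensity_acf(counts, max_lag):
    """Normalized temporal intensity autocorrelation g2 for lags 1..max_lag.

    counts: 1-D sequence of photon counts per time bin (assumed input format).
    Returns an array whose element k holds g2 at a lag of k + 1 bins.
    """
    counts = np.asarray(counts, dtype=float)
    n = counts.size
    if not 0 < max_lag < n:
        raise ValueError("max_lag must satisfy 0 < max_lag < len(counts)")

    g2 = np.empty(max_lag)
    for lag in range(1, max_lag + 1):
        head = counts[: n - lag]  # I(t)
        tail = counts[lag:]       # I(t + lag)
        # Symmetric normalization (an assumption of this sketch):
        # <I(t) I(t+lag)> divided by the product of the two segment means.
        g2[lag - 1] = np.mean(head * tail) / (np.mean(head) * np.mean(tail))
    return g2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Uncorrelated Poisson counts should give g2 close to 1 at every lag.
    demo = rng.poisson(lam=5.0, size=10_000)
    print(intensity_acf(demo, max_lag=5))
```

In a DCS analysis, an ACF computed this way is commonly related to the field autocorrelation through the Siegert relation, g2(τ) = 1 + β|g1(τ)|², which is where the coherence factor β mentioned in the abstract enters. Each lag in the loop is independent of the others, so the computation is a natural candidate for the kind of loop unrolling and pipelining the abstract describes.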

Similar articles

1. Towards high-performance deep learning architecture and hardware accelerator design for robust analysis in diffuse correlation spectroscopy.
   Comput Methods Programs Biomed. 2025 Jan;258:108471. doi: 10.1016/j.cmpb.2024.108471. Epub 2024 Oct 28.
2. Designing Deep Learning Hardware Accelerator and Efficiency Evaluation.
   Comput Intell Neurosci. 2022 Jul 13;2022:1291103. doi: 10.1155/2022/1291103. eCollection 2022.
3. A Device-on-Chip Solution for Real-Time Diffuse Correlation Spectroscopy Using FPGA.
   Biosensors (Basel). 2024 Aug 8;14(8):384. doi: 10.3390/bios14080384.
4. Quantification of blood flow index in diffuse correlation spectroscopy using a robust deep learning method.
   J Biomed Opt. 2024 Jan;29(1):015004. doi: 10.1117/1.JBO.29.1.015004. Epub 2024 Jan 27.
5. Acceleration of Deep Neural Network Training Using Field Programmable Gate Arrays.
   Comput Intell Neurosci. 2022 Oct 17;2022:8387364. doi: 10.1155/2022/8387364. eCollection 2022.
6. Accelerating GRAPPA reconstruction using SoC design for real-time cardiac MRI.
   Comput Biol Med. 2023 Jun;160:107008. doi: 10.1016/j.compbiomed.2023.107008. Epub 2023 May 4.
7. Compact and robust deep learning architecture for fluorescence lifetime imaging and FPGA implementation.
   Methods Appl Fluoresc. 2023 Mar 20;11(2). doi: 10.1088/2050-6120/acc0d9.
8. Custom Hardware Architectures for Deep Learning on Portable Devices: A Review.
   IEEE Trans Neural Netw Learn Syst. 2022 Nov;33(11):6068-6088. doi: 10.1109/TNNLS.2021.3082304. Epub 2022 Oct 27.
9. Customizable FPGA-Based Hardware Accelerator for Standard Convolution Processes Empowered with Quantization Applied to LiDAR Data.
   Sensors (Basel). 2022 Mar 11;22(6):2184. doi: 10.3390/s22062184.
10. An OpenCL-Based FPGA Accelerator for Faster R-CNN.
   Entropy (Basel). 2022 Sep 23;24(10):1346. doi: 10.3390/e24101346.

Cited by

1. Fast blood flow index reconstruction of diffuse correlation spectroscopy using a back-propagation-free data-driven algorithm.
   Biomed Opt Express. 2025 Feb 26;16(3):1254-1269. doi: 10.1364/BOE.549363. eCollection 2025 Mar 1.