
Real-time data analysis for medical diagnosis using FPGA-accelerated neural networks.

Affiliations

Computer Architecture and Automated Design Lab, Boston University, Boston, MA, USA.

Argonne Leadership Computing Facility, Argonne National Laboratory, Lemont, IL, USA.

Publication

BMC Bioinformatics. 2018 Dec 21;19(Suppl 18):490. doi: 10.1186/s12859-018-2505-7.

Abstract

BACKGROUND

Real-time analysis of patient data during medical procedures can provide vital diagnostic feedback that significantly improves chances of success. With sensors becoming increasingly fast, frameworks such as Deep Neural Networks are required to perform calculations within the strict timing constraints for real-time operation. However, traditional computing platforms responsible for running these algorithms incur a large overhead due to communication protocols, memory accesses, and static (often generic) architectures. In this work, we implement a low-latency Multi-Layer Perceptron (MLP) processor using Field Programmable Gate Arrays (FPGAs). Unlike CPUs and Graphics Processing Units (GPUs), our FPGA-based design can directly interface sensors, storage devices, display devices and even actuators, thus reducing the delays of data movement between ports and compute pipelines. Moreover, the compute pipelines themselves are tailored specifically to the application, improving resource utilization and reducing idle cycles. We demonstrate the effectiveness of our approach using mass-spectrometry data sets for real-time cancer detection.
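The paper itself does not include source code; as a point of reference, the sketch below shows the core MLP computation that an application-tailored FPGA pipeline of this kind would implement, written in the fully static, fixed-size style typical of HLS designs. The layer sizes, float arithmetic, ReLU activation, and function name are assumptions for illustration only, not the authors' actual design.

```cpp
// Illustrative sketch only: one fixed-size dense layer, out = relu(W*in + b),
// written in the static style used for FPGA/HLS pipelines.
// Sizes (IN=64, OUT=32), float data, and ReLU are assumptions.
#include <cstddef>
#include <cstdio>

constexpr std::size_t IN  = 64;   // assumed input width
constexpr std::size_t OUT = 32;   // assumed output width

void dense_relu(const float in[IN],
                const float weights[OUT][IN],
                const float bias[OUT],
                float out[OUT]) {
    for (std::size_t o = 0; o < OUT; ++o) {      // on an FPGA: pipelined
        float acc = bias[o];
        for (std::size_t i = 0; i < IN; ++i) {   // on an FPGA: fully unrolled
            acc += weights[o][i] * in[i];
        }
        out[o] = acc > 0.0f ? acc : 0.0f;        // ReLU (assumed activation)
    }
}

int main() {
    static float in[IN] = {1.0f};
    static float w[OUT][IN] = {};
    static float b[OUT] = {};
    static float out[OUT];
    dense_relu(in, w, b, out);
    std::printf("out[0] = %f\n", out[0]);
    return 0;
}
```

Because the loop bounds and weights are fixed at compile time, the multiply-accumulate datapath can be sized exactly to the application, which is the kind of tailoring the authors argue reduces idle cycles relative to a generic CPU or GPU pipeline.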

RESULTS

We demonstrate that correct parameter sizing, based on the application, can reduce latency by 20% on average. Furthermore, we show that in an application with tightly coupled data-path and latency constraints, having a large amount of computing resources can actually reduce performance. Using mass-spectrometry benchmarks, we show that our proposed FPGA design outperforms both CPU and GPU implementations, with an average speedup of 144x and 21x, respectively.

CONCLUSION

In our work, we demonstrate the importance of application-specific optimizations in order to minimize latency and maximize resource utilization for MLP inference. By directly interfacing and processing sensor data with ultra-low latency, FPGAs can perform real-time analysis during procedures and provide diagnostic feedback that can be critical to achieving higher percentages of successful patient outcomes.

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e1f/6302367/e64fcf050d6d/12859_2018_2505_Fig1_HTML.jpg
