
GPU-Accelerated Machine Learning Inference as a Service for Computing in Neutrino Experiments

Authors

Wang Michael, Yang Tingjun, Flechas Maria Acosta, Harris Philip, Hawks Benjamin, Holzman Burt, Knoepfel Kyle, Krupa Jeffrey, Pedro Kevin, Tran Nhan

Affiliations

Fermi National Accelerator Laboratory, Batavia, IL, United States.

Massachusetts Institute of Technology, Cambridge, MA, United States.

Publication Information

Front Big Data. 2021 Jan 14;3:604083. doi: 10.3389/fdata.2020.604083. eCollection 2020.

Abstract

Machine learning algorithms are becoming increasingly prevalent and performant in the reconstruction of events in accelerator-based neutrino experiments. These sophisticated algorithms can be computationally expensive. At the same time, the data volumes of such experiments are rapidly increasing. The demand to process billions of neutrino events with many machine learning algorithm inferences creates a computing challenge. We explore a computing model in which heterogeneous computing with GPU coprocessors is made available as a web service. The coprocessors can be efficiently and elastically deployed to provide the right amount of computing for a given processing task. With our approach, Services for Optimized Network Inference on Coprocessors (SONIC), we integrate GPU acceleration specifically for the ProtoDUNE-SP reconstruction chain without disrupting the native computing workflow. With our integrated framework, we accelerate the most time-consuming task, track and particle shower hit identification, by a factor of 17. This results in a factor of 2.7 reduction in the total processing time when compared with CPU-only production. For this particular task, only 1 GPU is required for every 68 CPU threads, providing a cost-effective solution.
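The abstract's two headline numbers are mutually consistent under Amdahl's law: a 17x speedup on the hit-identification task yields a 2.7x reduction in total processing time only if that task dominates the CPU-only workload. The fraction of time it occupies is not stated in the abstract, but it can be derived from the two reported factors. A minimal sketch of that check (the fraction `f` is an inferred quantity, not a figure from the paper):

```python
# Amdahl's-law consistency check of the abstract's reported speedups.
# total_speedup = 1 / ((1 - f) + f / task_speedup), where f is the
# fraction of CPU-only processing time spent on hit identification.
task_speedup = 17.0   # reported acceleration of hit identification
total_speedup = 2.7   # reported reduction in total processing time

# Solve the Amdahl relation for f:
f = (1.0 - 1.0 / total_speedup) / (1.0 - 1.0 / task_speedup)

# Plug f back in to confirm it reproduces the overall factor.
implied_total = 1.0 / ((1.0 - f) + f / task_speedup)

print(f"inferred fraction of CPU time in hit identification: {f:.2f}")
print(f"implied total speedup: {implied_total:.2f}")
```

This implies hit identification consumes roughly two-thirds of the CPU-only reconstruction time, which is consistent with the abstract describing it as the most time-consuming task.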


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3b66/7931905/f7d283bfacf8/fdata-03-604083-g001.jpg
