Application Performance Analysis and Efficient Execution on Systems with multi-core CPUs, GPUs and MICs: A Case Study with Microscopy Image Analysis.

Authors

Teodoro George, Kurc Tahsin, Andrade Guilherme, Kong Jun, Ferreira Renato, Saltz Joel

Affiliations

Department of Computer Science, University of Brasília, Brasília, DF, Brazil.

Department of Biomedical Informatics, Stony Brook University, Stony Brook, NY, USA; Scientific Data Group, Oak Ridge National Laboratory, Oak Ridge, TN, USA.

Publication information

Int J High Perform Comput Appl. 2017 Jan;31(1):32-51. doi: 10.1177/1094342015594519. Epub 2015 Jul 27.

DOI:10.1177/1094342015594519
PMID:28239253
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5319667/
Abstract

We carry out a comparative performance study of multi-core CPUs, GPUs and the Intel Xeon Phi (Many Integrated Core, MIC) using a microscopy image analysis application. We experimentally evaluate the performance of these computing devices on the core operations of the application. We correlate the observed performance with the characteristics of the computing devices and with the data access patterns, computational complexity, and parallelization forms of the operations. The results show significant variability in the performance of operations with respect to the device used. Operations with regular data access perform comparably, and sometimes better, on a MIC than on a GPU. GPUs are more efficient than MICs for operations that access data irregularly, because of the lower bandwidth of the MIC for random data accesses. We propose new performance-aware scheduling strategies that consider the variability in operation speedups. Our scheduling strategies significantly improve application performance compared to classic strategies in hybrid configurations.
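
The abstract does not spell out the scheduling strategies, so the following is only a minimal sketch of the general idea of speedup-aware scheduling, not the authors' algorithm. It assumes each operation has a known CPU run time and an estimated accelerator speedup; the names `simulate`, `fcfs`, and `speedup_aware`, the device labels, and the task numbers are all hypothetical. An idle accelerator takes the pending operation with the highest estimated speedup, an idle CPU core takes the one with the lowest, and the result is compared with a first-come, first-served baseline.

```python
import heapq

def simulate(tasks, devices, pick):
    """Greedy list scheduling on a hybrid machine; returns the makespan.

    tasks   -- list of (cpu_time, accel_speedup) pairs (illustrative numbers)
    devices -- list of device kinds, each "cpu" or "accel"
    pick    -- policy: pick(pending, kind) -> index of the task to run next
    """
    pending = list(tasks)
    # Min-heap of (time at which the device becomes idle, device index).
    idle_at = [(0.0, i) for i in range(len(devices))]
    heapq.heapify(idle_at)
    makespan = 0.0
    while pending:
        now, dev = heapq.heappop(idle_at)          # next device to go idle
        cpu_time, speedup = pending.pop(pick(pending, devices[dev]))
        # An accelerator runs the task `speedup` times faster than a CPU core.
        run = cpu_time / speedup if devices[dev] == "accel" else cpu_time
        finish = now + run
        makespan = max(makespan, finish)
        heapq.heappush(idle_at, (finish, dev))
    return makespan

def fcfs(pending, kind):
    """Baseline: every device takes the oldest pending task."""
    return 0

def speedup_aware(pending, kind):
    """Accelerators take the highest-speedup task, CPUs the lowest."""
    by_speedup = lambda j: pending[j][1]
    indices = range(len(pending))
    if kind == "accel":
        return max(indices, key=by_speedup)
    return min(indices, key=by_speedup)

# Hypothetical workload: equal CPU cost, very different accelerator speedups.
tasks = [(10.0, 20.0), (10.0, 1.5), (10.0, 15.0), (10.0, 2.0)]
devices = ["cpu", "accel"]

fcfs_makespan = simulate(tasks, devices, fcfs)
aware_makespan = simulate(tasks, devices, speedup_aware)
```

With this made-up workload the baseline hands the 20x operation to the CPU core, so `aware_makespan` comes out smaller than `fcfs_makespan`. The gap depends entirely on how much the speedups vary across operations, which is the property the abstract says the proposed strategies take into account.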

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bae/5319667/fe49f36aa5e0/nihms704760f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bae/5319667/6292d0cb90e6/nihms704760f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bae/5319667/f4e89636b9a8/nihms704760f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bae/5319667/0292a2338a33/nihms704760f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bae/5319667/3a0ec8466d2d/nihms704760f5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bae/5319667/bcb58beb7229/nihms704760f6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bae/5319667/7a95db62829a/nihms704760f7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bae/5319667/9aab254fe7e1/nihms704760f8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bae/5319667/e306a7be065d/nihms704760f9.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bae/5319667/44f21c26cab8/nihms704760f10.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bae/5319667/57776dbbb72e/nihms704760f11.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7bae/5319667/4221eb467616/nihms704760f12.jpg
