Barreiros Willian, Moreira Jeremias, Kurc Tahsin, Kong Jun, Melo Alba C M A, Saltz Joel H, Teodoro George
Department of Computer Science, University of Brasília, Brasília, Brazil.
Department of Biomedical Informatics, Stony Brook University, Stony Brook, New York.
Concurr Comput. 2020 Jan 25;32(2). doi: 10.1002/cpe.5403. Epub 2019 Jun 24.
Parameter sensitivity analysis (SA) is an effective tool for understanding complex analysis applications and assessing the variability in their results. However, it is expensive, because the target application must be executed many times with a large number of different input parameter values. In this work, we propose optimizations to reduce the overall computation cost of SA for analysis applications that segment high-resolution slide tissue images, i.e., images with resolutions of 100k × 100k pixels. Two cost-cutting techniques are combined to execute SA efficiently: parallel execution on distributed hybrid systems, and computation reuse at multiple levels of the analysis pipeline to reduce the amount of computation. These techniques were evaluated using a cancer image analysis workflow on a hybrid cluster with 256 nodes, each with an Intel Phi and a dual-socket CPU. Our parallel execution method attained an efficiency of over 90% on 256 nodes. Hybrid execution on the CPU and Intel Phi improved performance by 2×. Multilevel computation reuse led to performance gains of over 2.9×.
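The idea of computation reuse in a parameter sweep can be illustrated with a minimal sketch. It is not the authors' implementation: the three-stage pipeline, the stage functions (`normalize`, `threshold`, `filter_area`), and the prefix-keyed cache are hypothetical and chosen only to show that runs sharing the same leading parameter values can share the corresponding stage outputs.

```python
from itertools import product

# Hypothetical three-stage pipeline; each stage consumes exactly one parameter.
def normalize(image, target):
    return [px * target for px in image]

def threshold(image, cutoff):
    return [1 if px > cutoff else 0 for px in image]

def filter_area(mask, min_area):
    area = sum(mask)
    return area if area >= min_area else 0

STAGES = [normalize, threshold, filter_area]

def run_sweep(image, param_sets, reuse=True):
    """Run every parameter set; return (results, number of stage executions)."""
    cache = {}      # maps a parameter prefix to the output of the stage it reaches
    executed = 0
    results = []
    for params in param_sets:
        data = image
        for depth, stage in enumerate(STAGES):
            key = params[: depth + 1]   # parameters that influence this stage's output
            if reuse and key in cache:
                data = cache[key]       # reuse the output computed by an earlier run
                continue
            data = stage(data, params[depth])
            executed += 1
            if reuse:
                cache[key] = data
        results.append(data)
    return results, executed

image = [0.2, 0.5, 0.9, 0.7]
# 2 x 2 x 3 = 12 parameter sets, as a sensitivity analysis sweep might generate.
param_sets = list(product([1.0, 2.0], [0.4, 0.8], [1, 2, 3]))

with_reuse, n_reuse = run_sweep(image, param_sets, reuse=True)
without_reuse, n_plain = run_sweep(image, param_sets, reuse=False)
print(n_plain, n_reuse)
```

In this toy sweep, both modes produce identical results, but the reuse variant executes each distinct parameter prefix only once. The savings depend entirely on how many parameter sets share leading values, so the ratio seen here says nothing about the gains reported in the abstract.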