Teodoro George, Kurç Tahsin M, Taveira Luís F R, Melo Alba C M A, Gao Yi, Kong Jun, Saltz Joel H
Department of Computer Science, University of Brasília, Brasília 70910-900, Brazil.
Biomedical Informatics Department, Stony Brook University, Stony Brook, NY 11794-8322, USA.
Bioinformatics. 2017 Apr 1;33(7):1064-1072. doi: 10.1093/bioinformatics/btw749.
Sensitivity analysis and parameter tuning are important processes in large-scale image analysis. They are very costly because the image analysis workflows must be executed many times, either to systematically correlate output variations with parameter changes or to tune parameters. Scaling these studies to large imaging datasets and expensive analysis workflows requires an integrated solution that needs minimal user interaction and combines effective methodologies with high performance computing.
Experiments with two segmentation workflows show that the proposed approach can (i) quickly identify and prune non-influential parameters; (ii) evaluate only a small fraction (about 100 points) of a parameter search space containing billions to trillions of points and still improve the quality of segmentation results (Dice and Jaccard metrics) by as much as 1.42× compared to the results from the default parameters; and (iii) attain good scalability on a high performance cluster with several effective optimizations.
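The Dice and Jaccard metrics named above both measure the overlap between a segmentation result and a reference mask. The following is a minimal sketch of their standard definitions, not the authors' implementation: the function name `dice_and_jaccard`, the flattened 0/1 list representation of a mask, and the convention of scoring two empty masks as a perfect match are all assumptions made for illustration.

```python
def dice_and_jaccard(mask_a, mask_b):
    """Return (Dice, Jaccard) for two binary masks given as flat 0/1 sequences.

    Hypothetical helper for illustration; masks are assumed to be flattened
    images of equal size, with nonzero values marking foreground pixels.
    """
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must contain the same number of pixels")

    a = [bool(v) for v in mask_a]
    b = [bool(v) for v in mask_b]

    # Pixels marked as foreground in both masks.
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    size_a = sum(a)
    size_b = sum(b)
    union = size_a + size_b - intersection

    # Assumed convention: two empty masks agree perfectly.
    if union == 0:
        return 1.0, 1.0

    dice = 2.0 * intersection / (size_a + size_b)
    jaccard = intersection / union
    return dice, jaccard


# Toy example: 8-pixel masks with 4 foreground pixels each, 3 of them shared.
reference = [0, 1, 1, 1, 0, 0, 1, 0]
candidate = [0, 1, 1, 0, 0, 1, 1, 0]
dice, jaccard = dice_and_jaccard(reference, candidate)
print(f"Dice = {dice:.2f}, Jaccard = {jaccard:.2f}")  # Dice = 0.75, Jaccard = 0.60
```

The two scores are monotonically related (Jaccard = Dice / (2 − Dice)), so they rank any pair of candidate parameter settings in the same order; an auto-tuning search could use either one as the objective.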
Our work demonstrates the feasibility of performing sensitivity analyses, parameter studies and auto-tuning with large datasets. The proposed framework can enable the quantification of error estimations and output variations in image segmentation pipelines.
Source code: https://github.com/SBU-BMI/region-templates/
Supplementary data are available at Bioinformatics online.