Performance comparison of modified ComBat for harmonization of radiomic features for multicenter studies.

Affiliations

INSERM, UMR 1101, LaTIM, University of Brest, Brest, France.

Department of Radiation Oncology, Institut de cancérologie de l'Ouest René-Gauducheau, Saint-Herblain, France.

Publication information

Sci Rep. 2020 Jun 24;10(1):10248. doi: 10.1038/s41598-020-66110-w.

Abstract

Multicenter studies are needed to demonstrate the potential clinical value of radiomics as a prognostic tool. However, variability in scanner models, acquisition protocols and reconstruction settings is unavoidable, and radiomic features are notoriously sensitive to these factors, which hinders pooling them in a statistical analysis. A statistical harmonization method called ComBat was developed to deal with the "batch effect" in gene expression microarray data and has been used in radiomics studies to deal with the "center effect". Our goal was to evaluate modifications to ComBat that allow more flexibility in choosing a reference and improve the robustness of the estimation. Two modifications were evaluated. M-ComBat transforms all feature distributions to a chosen reference instead of the overall mean, providing more flexibility. B-ComBat adds bootstrap and Monte Carlo for improved robustness in the estimation. BM-ComBat combines both modifications. The four versions (ComBat, M-ComBat, B-ComBat and BM-ComBat) were compared regarding their ability to harmonize features in a multicenter context in two different clinical datasets. The first contains 119 locally advanced cervical cancer patients from 3 centers, with magnetic resonance imaging and positron emission tomography imaging. In that case, ComBat was applied with 3 labels, one per center. The second contains 98 locally advanced laryngeal cancer patients from 5 centers, with contrast-enhanced computed tomography. In that case, because imaging settings were highly heterogeneous even within each of the five centers, unsupervised clustering was used to determine two labels for applying ComBat. The impact of each harmonization method was evaluated with three different machine learning pipelines for the modelling step of predicting clinical outcomes, using two performance metrics (balanced accuracy and Matthews correlation coefficient).
Before harmonization, almost all radiomic features had significantly different distributions between labels. These differences were successfully removed by all ComBat versions. The predictive ability of the radiomic models always improved with harmonization, and the modified ComBat provided the best results. This was observed consistently in both datasets, across all machine learning pipelines and both performance metrics. The proposed modifications allow for more flexibility and robustness in the estimation. They also slightly but consistently improve the predictive power of the resulting radiomic models.
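To make the difference between aligning to the overall mean and aligning to a chosen reference more concrete, the following is a minimal sketch of a location/scale adjustment in Python with NumPy. It is not the authors' implementation. It assumes no covariates and omits the empirical Bayes shrinkage of full ComBat as well as the bootstrap and Monte Carlo steps of B-ComBat. The function name `location_scale_harmonize` and its parameters are hypothetical, introduced only for illustration.

```python
import numpy as np


def location_scale_harmonize(features, labels, reference=None):
    """Simplified per-feature location/scale alignment across batches.

    Assumption: no covariates, no empirical Bayes shrinkage, no bootstrap.
    features : array of shape (n_samples, n_features)
    labels   : array of shape (n_samples,), one batch label per sample
    reference: batch label to align to (M-ComBat-like behaviour);
               None aligns to the grand mean and pooled within-batch
               standard deviation (ComBat-like behaviour).
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    batches = np.unique(labels)

    # Per-batch mean and sample standard deviation of every feature.
    means = {b: features[labels == b].mean(axis=0) for b in batches}
    stds = {b: features[labels == b].std(axis=0, ddof=1) for b in batches}

    if reference is None:
        # Target: grand mean and pooled within-batch standard deviation.
        target_mean = features.mean(axis=0)
        residuals = features - np.stack([means[b] for b in labels])
        dof = len(labels) - len(batches)
        target_std = np.sqrt((residuals ** 2).sum(axis=0) / dof)
    else:
        # Target: the location and scale of the chosen reference batch.
        target_mean = means[reference]
        target_std = stds[reference]

    harmonized = np.empty_like(features)
    for b in batches:
        mask = labels == b
        # Standardize within the batch, then rescale to the target.
        z = (features[mask] - means[b]) / stds[b]
        harmonized[mask] = z * target_std + target_mean
    return harmonized


# Usage with synthetic data: three "centers" with different shifts and scales.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 1.0, size=(30, 4)),
    rng.normal(5.0, 2.0, size=(40, 4)),
    rng.normal(-3.0, 0.5, size=(25, 4)),
])
y = np.array([0] * 30 + [1] * 40 + [2] * 25)
X_to_reference = location_scale_harmonize(X, y, reference=0)
X_to_overall = location_scale_harmonize(X, y)
```

With a reference label, the reference batch is left essentially unchanged and the other batches are moved onto its mean and standard deviation. Without one, every batch, including the one that would otherwise serve as reference, is shifted toward a pooled target.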

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/afec/7314795/c9657b264039/41598_2020_66110_Fig1_HTML.jpg
