Artificial Intelligence for Prognostic Scores in Oncology: a Benchmarking Study.

Author information

Loureiro Hugo, Becker Tim, Bauer-Mehren Anna, Ahmidi Narges, Weberpals Janick

Affiliations

Data Science, Pharmaceutical Research and Early Development Informatics (pREDi), Roche Innovation Center Munich (RICM), Penzberg, Germany.

Institute of Computational Biology, Helmholtz Zentrum Munich, Munich, Germany.

Publication information

Front Artif Intell. 2021 Apr 16;4:625573. doi: 10.3389/frai.2021.625573. eCollection 2021.

Abstract

Prognostic scores are important tools in oncology that facilitate clinical decision-making based on patient characteristics. To date, classic survival analysis using Cox proportional hazards regression has been employed in the development of these prognostic scores. With the advance of analytical models, this study aimed to determine whether more complex machine-learning algorithms could outperform classical survival analysis methods. In this benchmarking study, two datasets were used to develop and compare different prognostic models for overall survival in pan-cancer populations: a nationwide EHR-derived de-identified database for training and in-sample testing, and the OAK (phase III clinical trial) dataset for out-of-sample testing. The real-world database comprised 136K first-line treated cancer patients across multiple cancer types and was split into a 90% training set and a 10% test set. The OAK dataset comprised 1,187 patients diagnosed with non-small cell lung cancer. To assess the effect of the number of covariates on prognostic performance, we formed three feature sets with 27, 44 and 88 covariates. In terms of methods, we benchmarked ROPRO, a prognostic score based on the Cox model, against eight complex machine-learning models: regularized Cox, Random Survival Forests (RSF), Gradient Boosting (GB), DeepSurv (DS), Autoencoder (AE) and Super Learner (SL). The C-index was used as the performance metric to compare the models. For in-sample testing on the real-world database, the C-index [95% CI] values for RSF 0.720 [0.716, 0.725], GB 0.722 [0.718, 0.727], DS 0.721 [0.717, 0.726] and SL 0.723 [0.718, 0.728] showed significantly better performance than ROPRO 0.701 [0.696, 0.706]. Similar results were obtained across all feature sets. However, in the out-of-sample validation on OAK, the stronger performance of the more complex models was no longer apparent. Consistently, increasing the number of prognostic covariates did not lead to an increase in model performance. The stronger performance of the more complex models did not generalize when applied to an out-of-sample dataset. We hypothesize that future research may benefit from adding multimodal data to exploit the advantages of more complex models.
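The abstract names the C-index as its comparison metric without defining it. As a rough illustration only, the sketch below computes Harrell's concordance index for right-censored data in plain Python. The function name `concordance_index`, the toy inputs, and the simplified handling of tied survival times are assumptions made for this example; they are not taken from the study, which may have used a different estimator or library.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data (illustrative sketch).

    times       : observed follow-up times
    events      : 1/True if death was observed, 0/False if censored
    risk_scores : model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            # Simplification (assumption): pairs with tied times are skipped.
            continue
        # 'a' is the subject with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            # The earlier subject was censored, so the pair is not comparable.
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # the higher-risk subject died first
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, the third one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.7, 0.8, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
print(toy_c)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which puts the reported in-sample gap between roughly 0.70 and 0.72 into perspective.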


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/90cc/8086599/607b19f76680/frai-04-625573-g001.jpg
