Integration of Clinical Information and Imputed Aneuploidy Scores to Enhance Relapse Prediction in Early Stage Lung Cancer Patients.

Affiliations

Data Science Institute, NUI Galway, Galway, Ireland.

Insight Centre for Data Analytics, NUI Galway, Galway, Ireland.

Publication information

AMIA Annu Symp Proc. 2023 Apr 29;2022:1062-1071. eCollection 2022.

PMID: 37128408

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10148374/

Abstract

Early-stage lung cancer is clinically important because of its insidious nature and rapid progression. Most models designed to predict tumour recurrence in early-stage lung cancer rely on the clinical or medical history of the patient. Their performance could likely be improved if the input patient data contained genomic information, but such data is not always collected. This is the main motivation of our work, in which we imputed a specific type of genomic data and integrated it with clinical data to increase the accuracy of machine learning models for predicting relapse in early-stage non-small cell lung cancer patients. Using a publicly available TCGA lung adenocarcinoma cohort of 501 patients, we imputed their aneuploidy scores into similar records in the Spanish Lung Cancer Group (SLCG) data, specifically a cohort of 1348 early-stage patients. First, tumour recurrence in those patients was predicted without the imputed aneuploidy scores. Then, the SLCG data were enriched with the aneuploidy scores imputed from TCGA. This integrative approach improved the prediction of relapse risk, achieving an area under the precision-recall curve (PR-AUC) of 0.74 and an area under the ROC curve (ROC-AUC) of 0.79. Using the prediction explanation method SHAP (SHapley Additive exPlanations), we further explained the predictions made by the machine learning model. We conclude that our explainable predictive model is a promising tool for oncologists: it addresses an unmet clinical need for post-treatment patient stratification based on relapse risk, and it improves predictive power by incorporating proxy genomic data that is not available for the actual patients.
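The abstract does not say how "similar records" were matched or how the two metrics were computed, so the following is only a minimal sketch under stated assumptions. It assumes nearest-neighbour matching on shared clinical features (a hypothetical choice), and it estimates PR-AUC by average precision, which is one common estimator. The function names, the toy features (age, stage), and all values are illustrative and are not taken from the paper.

```python
import math


def impute_from_donors(targets, donors, donor_scores, k=3):
    """Assumed method: give each target record the mean score of its
    k nearest donor records (Euclidean distance on shared features)."""
    imputed = []
    for t in targets:
        ranked = sorted(range(len(donors)), key=lambda i: math.dist(t, donors[i]))
        nearest = ranked[:k]
        imputed.append(sum(donor_scores[i] for i in nearest) / len(nearest))
    return imputed


def roc_auc(labels, scores):
    """Probability that a random positive outranks a random negative
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise estimate of the area under the precision-recall curve.
    Assumes at least one positive label and no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / hits


# Toy donor cohort with known scores: (age, stage) -> aneuploidy score.
donors = [(60, 1), (70, 2), (50, 1)]
donor_scores = [5, 20, 3]

# Toy target cohort that lacks genomic data.
targets = [(61, 1), (69, 2)]
imputed_scores = impute_from_donors(targets, donors, donor_scores, k=1)

# Toy relapse labels and model outputs, used only to exercise the metrics.
labels = [0, 1, 1, 0]
predicted_risk = [0.1, 0.8, 0.4, 0.5]
auc = roc_auc(labels, predicted_risk)
ap = average_precision(labels, predicted_risk)
```

In practice the features would need scaling before any distance is computed, and the imputed score would be appended to each target record as one extra model input. Both steps are left out here for brevity.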
