Lin Lin, Bao Yongxia
Department of Respiratory and Critical Care Medicine, The Second Affiliated Hospital of Harbin Medical University, Harbin, Heilongjiang, People's Republic of China.
Cancer Biomark. 2025 Jan;42(1):18758592241308756. doi: 10.1177/18758592241308756. Epub 2025 Apr 2.
Objective: This study aimed to develop diagnostic and prognostic models for lung adenocarcinoma (LUAD) using machine learning (ML) algorithms, in order to improve the accuracy of clinical decision-making.

Methods: Data from The Cancer Genome Atlas (TCGA) for LUAD patients were split into a training set (n = 196) and a test set (n = 133). Three feature selection methods, Least Absolute Shrinkage and Selection Operator (LASSO), Random Forest (RF), and Support Vector Machine (SVM), identified miRNAs distinguishing stage I LUAD. Six ML algorithms were used to predict pulmonary nodule classification. Model performance was evaluated using receiver operating characteristic (ROC) curves, precision-recall (PR) curves, and error rates (CE). A prognostic model was constructed using LASSO Cox regression. Risk score plots were generated, and model performance was assessed using Kaplan-Meier (K-M) curves and time-dependent ROC curves. Functional enrichment analyses investigated miRNA function and mechanism.

Results: Feature selection identified five miRNAs that distinguish early-stage LUAD from adjacent non-cancerous tissue. A prognostic model based on 13 miRNAs predicted poorer outcomes for patients with higher risk scores, a finding supported by time-dependent ROC curves and a nomogram. Functional enrichment analysis identified cancer-related signaling pathways for the biomarkers.

Conclusion: ML identified a diagnostic five-miRNA signature and a prognostic 13-miRNA model for LUAD, both robust and reliable.
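The risk score behind the prognostic model can be illustrated with a minimal sketch. It assumes the usual form for a LASSO Cox signature, a coefficient-weighted sum of miRNA expression values, and it assumes a median cutoff for splitting patients into high-risk and low-risk groups; the abstract does not state the cutoff used. The miRNA names, coefficients, and expression values below are hypothetical placeholders, not values from the study.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear predictor: sum of coefficient * expression over the signature miRNAs."""
    return sum(coefficients[m] * expression[m] for m in coefficients)


def stratify(scores):
    """Label each patient 'high' or 'low' risk relative to the cohort median.

    The median cutoff is an assumption made for this illustration.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients and expression values, for illustration only.
coefficients = {"miR-A": 0.5, "miR-B": -0.25}
patients = {
    "P1": {"miR-A": 2.0, "miR-B": 4.0},
    "P2": {"miR-A": 6.0, "miR-B": 2.0},
    "P3": {"miR-A": 1.0, "miR-B": 8.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = stratify(scores)
```

In this toy cohort only P2 scores above the median and is labeled high risk. In a real analysis the two groups would then be compared with Kaplan-Meier curves, as described in the Methods.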