Department of Hepatobiliary Disease, The Third People's Hospital of Fujian University of Traditional Chinese Medicine, Fuzhou, China.
Radiodiagnostic Department, The 900 Hospital of the Joint Service Support Force of the People's Liberation Army of China, Fuzhou, China.
PeerJ. 2024 Nov 14;12:e18310. doi: 10.7717/peerj.18310. eCollection 2024.
Lung adenocarcinoma (LUAD) is a common cancer with a high mortality rate. Radiomics, a high-throughput method, is widely applied across many aspects of cancer management. However, the molecular mechanisms of LUAD have not yet been clearly characterized by combining transcriptomics with radiomics.
Transcriptome data and radiomic features of LUAD were obtained from publicly available databases. We then used weighted gene co-expression network analysis (WGCNA) and a series of machine learning algorithms, including Random Forest (RF), Least Absolute Shrinkage and Selection Operator (LASSO) logistic regression, and Support Vector Machine-Recursive Feature Elimination (SVM-RFE), to screen for diagnostic genes for LUAD. In addition, the CIBERSORT and ESTIMATE algorithms were used to assess the association of these genes with immune profiles. LASSO was further used to identify the radiomic features most relevant to the expression levels of the LUAD diagnostic genes, and the resulting model was validated using receiver operating characteristic (ROC) curves, precision-recall (PR) curves, calibration curves, and decision curve analysis (DCA). Finally, RT-qPCR, transwell, and cell counting kit-8 (CCK8) assays were performed to assess the expression levels and potential functions of the screened genes in LUAD cell lines.
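As an illustration of the decision curve analysis named above, the following Python sketch computes the standard net benefit of a model at a single threshold probability and compares it with a "treat all" strategy. The function names, the toy probabilities, and the labels are assumptions made for illustration only. They are not data from the study, and the abstract does not state how the authors implemented DCA.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of acting on predictions >= threshold.

    Uses the usual DCA definition:
        NB = TP/n - FP/n * threshold / (1 - threshold)
    `probs` are predicted probabilities, `labels` are 0/1 outcomes.
    """
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    if len(probs) != len(labels) or not labels:
        raise ValueError("probs and labels must be non-empty and equal length")
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of classifying every sample as positive (reference curve)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy inputs, not values from the study.
toy_probs = [0.9, 0.8, 0.3, 0.2]
toy_labels = [1, 1, 0, 0]

for t in (0.2, 0.5, 0.8):
    print(t, net_benefit(toy_probs, toy_labels, t),
          net_benefit_treat_all(toy_labels, t))
```

A decision curve plots these quantities over a range of thresholds. A model is considered useful at a given threshold when its net benefit exceeds both the "treat all" value and zero (the "treat none" value).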
Using WGCNA, we identified 214 module genes most strongly correlated with LUAD samples, of which 192 were highly expressed in LUAD patients. The three machine learning algorithms then identified four genes (UBE2T, TEDC2, RCC1, and FAM136A) as diagnostic molecules for LUAD, and ROC curves showed that these molecules had good diagnostic performance (AUC values of 0.989, 0.989, 0.989, and 0.987, respectively). The expression of these diagnostic molecules was significantly higher in tumor samples than in normal para-cancerous tissue samples and was significantly negatively correlated with stromal and immune scores. We also constructed a model of TEDC2 expression comprising seven radiomic features. The ROC and PR curves showed that this model achieved AUC values of up to 0.96. Knockdown of TEDC2 slowed the proliferation, migration, and invasion of LUAD cell lines.
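To make the reported AUC values concrete, the following Python sketch computes the area under the ROC curve for a single continuous marker, using the equivalent rank-based definition: the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The function name and the toy expression values are assumptions for illustration. They are not data from the study, and this is not necessarily how the authors computed their AUCs.

```python
def auc_from_scores(scores, labels):
    """ROC AUC of `scores` for binary `labels` (1 = positive, 0 = negative).

    Computed as the fraction of positive/negative pairs in which the
    positive sample has the higher score; ties contribute 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical expression values for one gene, not values from the study.
toy_expression = [2.1, 3.4, 5.0, 5.2, 6.3, 7.7]
toy_is_tumor = [0, 0, 1, 0, 1, 1]

print(auc_from_scores(toy_expression, toy_is_tumor))
```

In this toy example, eight of the nine tumor/normal pairs are ordered correctly, so the AUC is 8/9. An AUC near 0.99, as reported for the four genes, would mean that almost every tumor sample has higher expression than almost every normal sample.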
In this study, we screened for diagnostic markers of LUAD and developed a non-invasive radiomics model by innovatively combining transcriptomics and radiomics data. These findings contribute to our understanding of LUAD biology and offer potential avenues for further exploration in clinical practice.