Department of Oncology, Hangzhou Normal University, Affiliated Hospital, Hangzhou, 310015, Zhejiang, China.
Department of Oncology, Shaoxing Cent Hospital, Shaoxing, 312030, Zhejiang, China.
Sci Rep. 2023 Nov 27;13(1):20930. doi: 10.1038/s41598-023-47659-8.
Lung adenocarcinoma (LUAD) is one of the most widespread and fatal types of lung cancer. Oxidative stress, resulting from an imbalance in the production and accumulation of reactive oxygen species (ROS), is considered a promising therapeutic target for cancer treatment. Currently, immune checkpoint blockade (ICB) therapy is being explored as a potentially effective treatment for early-stage LUAD. In this research, we aimed to identify distinct subtypes of LUAD patients by investigating genes associated with oxidative stress and immunotherapy, and to propose subtype-specific therapeutic strategies. We searched the Gene Expression Omnibus (GEO) for datasets that contained both expression data and survival information. We selected genes associated with oxidative stress and immunotherapy using keyword searches on GeneCards. We then combined expression data of LUAD samples from The Cancer Genome Atlas (TCGA) and 11 GEO datasets into a unified dataset. This dataset was randomly split into two equal subsets, Dataset_Training and Dataset_Testing, each containing 50% of the samples. We applied consensus clustering (CC) analysis to identify distinct LUAD subtypes within Dataset_Training. Molecular differences in oxidative stress levels, the tumor microenvironment (TME), and immune checkpoint genes (ICGs) were then investigated among these subtypes. Combining feature selection with machine learning techniques, we built models to predict subtype membership and retained the most accurate one. We validated the subtypes and models identified in Dataset_Training using Dataset_Testing. We identified the gene with the highest importance value in the machine learning model as the hub gene, and then used virtual screening to discover potential compounds targeting it.
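The random 50/50 split and the consensus clustering step described above can be illustrated with a small, dependency-free sketch. It is not the study's pipeline. The toy data, the function names (`split_half`, `kmeans_labels`, `consensus_matrix`), the use of k-means as the base clustering algorithm, and the parameters (30 resampling runs, 80% subsampling) are all assumptions made for illustration.

```python
import random

def split_half(sample_ids, seed=0):
    """Randomly split sample IDs into two equal halves (training / testing)."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    mid = len(ids) // 2
    return ids[:mid], ids[mid:]

def kmeans_labels(points, k, rng, n_steps=20):
    """Plain Lloyd's k-means (assumed base clusterer); one label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_steps):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster goes empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def consensus_matrix(points, k=2, n_runs=30, frac=0.8, seed=0):
    """Fraction of resampling runs in which each pair of samples co-clusters."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    for _ in range(n_runs):
        idx = rng.sample(range(n), int(frac * n))  # subsample without replacement
        labels = kmeans_labels([points[i] for i in idx], k, rng)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                together[i][j] += labels[a] == labels[b]
    return [
        [together[i][j] / sampled[i][j] if sampled[i][j] else 0.0 for j in range(n)]
        for i in range(n)
    ]

# Hypothetical toy "expression" data: 20 samples x 5 genes, two shifted groups.
rng = random.Random(1)
data = [[rng.gauss(0 if s < 10 else 10, 1) for _ in range(5)] for s in range(20)]

train_ids, test_ids = split_half(range(20), seed=0)
train = [data[i] for i in train_ids]
consensus = consensus_matrix(train, k=2)
```

Entries of `consensus` near 1 mark sample pairs that are grouped together in almost every resampling run, and entries near 0 mark pairs that are almost never grouped together. Final subtype labels are typically obtained by clustering this matrix, for example hierarchically on `1 - consensus`.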
In the unified dataset, we integrated 2,154 LUAD samples from TCGA-LUAD and 11 GEO datasets. We selected 1,311 genes associated with immune and oxidative stress processes and used their expression data for subtype identification through CC analysis. Within Dataset_Training, two distinct subtypes emerged, each marked by different activity levels of immune and oxidative stress pathways. Consequently, we named them the OX and IM subtypes. Notably, the OX subtype showed increased oxidative stress levels and a worse prognosis than the IM subtype. Conversely, the IM subtype showed higher levels of immune pathways, immune cells, and ICGs than the OX subtype. We confirmed these findings in Dataset_Testing. Through gene selection, we identified an optimal combination of 12 genes for predicting LUAD subtypes: ACP1, AURKA, BIRC5, CYC1, GSTP1, HSPD1, HSPE1, MDH2, MRPL13, NDUFS1, SNRPD1, and SORD. Of the four machine learning models we tested, the support vector machine (SVM) performed best, achieving the highest area under the curve (AUC) of 0.86 and an accuracy of 0.78 on Dataset_Testing. We designated HSPE1 as the hub gene because it had the highest importance in the SVM model, and computed docking structures for four compounds: ZINC3978005 (Dihydroergotamine), ZINC52955754 (Ergotamine), ZINC150588351 (Elbasvir), and ZINC242548690 (Digoxin). Our study identified two subtypes of LUAD patients based on oxidative stress and immunotherapy-related genes. Our findings provide subtype-specific therapeutic strategies.
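For readers unfamiliar with the two reported metrics, the following minimal sketch shows how AUC and accuracy can be computed from a classifier's decision scores. The labels and scores are hypothetical toy values, not data from the study. The function names and the threshold of 0 (a common convention for SVM decision values) are assumptions.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy(labels, scores, threshold=0.0):
    """Fraction of samples whose thresholded score matches the true label."""
    preds = [1 if s > threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Hypothetical example: 1 = OX subtype, 0 = IM subtype.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [1.2, 0.4, -0.3, 0.9, -1.1, 0.2, -0.6, -0.8]

auc = roc_auc(labels, scores)   # 15 of 16 positive/negative pairs ranked correctly: 0.9375
acc = accuracy(labels, scores)  # 6 of 8 samples classified correctly: 0.75
```

AUC depends only on how the scores rank the samples, while accuracy also depends on the chosen threshold. This is why a model can show a higher AUC than accuracy, as with the 0.86 and 0.78 reported above.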