Li Zhenzhen, Li Jingwen, Li Sifan, Wang Yangyang, Wang Jihan
Research & Development Institute of Northwestern Polytechnical University in Shenzhen, Shenzhen 518057, China.
Xi'an Key Laboratory of Stem Cell and Regenerative Medicine, Institute of Medical Research, Northwestern Polytechnical University, Xi'an 710072, China.
Biomedicines. 2025 Apr 28;13(5):1067. doi: 10.3390/biomedicines13051067.
Precise diagnosis and classification of acute myeloid leukemia (AML) have important implications for clinical management and medical research. We investigated the expression of protein-coding genes in blood samples from AML patients and controls using The Cancer Genome Atlas (TCGA) and Genotype-Tissue Expression (GTEx) databases. We then applied least absolute shrinkage and selection operator (LASSO) feature selection to identify the optimal gene subsets for distinguishing AML patients from controls, and for distinguishing a particular French-American-British (FAB) subtype from the other AML subtypes. Using the LASSO method, we identified a subset of 101 genes that effectively distinguished AML patients from control individuals; of these, 70 were up-regulated and 31 were down-regulated in AML. Functional annotation and pathway analysis indicated that these genes are involved in RNA-related pathways, which is consistent with the epigenetic changes observed in AML. Survival analysis revealed that several genes are correlated with overall survival in AML patients. Additionally, LASSO-based gene subset analysis revealed differences between certain AML subtypes, providing insights into subtype-specific molecular mechanisms and differentiation therapy. This study demonstrates the application of machine learning to genomic data analysis for identifying gene subsets relevant to AML diagnosis and classification, which could improve understanding of the molecular landscape of AML. The survival-related genes and subtype-specific markers identified here may point to novel targets for personalized medicine in the treatment of AML.
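To illustrate how an L1 penalty can reduce many candidate genes to a small subset, the following is a minimal sketch only, not the authors' pipeline. It assumes a simplified setting: L1-penalized least squares on +1/-1 class labels, fitted by cyclic coordinate descent, on synthetic data. The abstract does not state the loss function, software, or penalty strength used in the study, so every name and value here (`make_synthetic_data`, `lasso_coordinate_descent`, `alpha=0.3`, the 20 simulated "genes") is hypothetical.

```python
import random

N_SAMPLES = 60  # hypothetical: 30 simulated "cases" and 30 simulated "controls"
N_GENES = 20    # hypothetical: genes 0 and 1 carry signal, the rest are noise


def make_synthetic_data(seed=0):
    """Simulate centred expression values and balanced +1/-1 labels."""
    rng = random.Random(seed)
    labels = [1.0 if i < N_SAMPLES // 2 else -1.0 for i in range(N_SAMPLES)]
    rows = []
    for label in labels:
        row = [rng.gauss(0.0, 1.0) for _ in range(N_GENES)]
        row[0] += 1.5 * label  # simulated gene that is higher in cases
        row[1] -= 1.5 * label  # simulated gene that is lower in cases
        rows.append(row)
    # Centre each column so no intercept is needed (labels already average 0).
    means = [sum(r[j] for r in rows) / N_SAMPLES for j in range(N_GENES)]
    X = [[r[j] - means[j] for j in range(N_GENES)] for r in rows]
    return X, labels


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return exactly 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


X, y = make_synthetic_data()
weights = lasso_coordinate_descent(X, y, alpha=0.3)  # alpha chosen arbitrarily
selected = [j for j, w in enumerate(weights) if w != 0.0]
print("selected gene indices:", selected)
```

The genes with nonzero weights form the selected subset, and a larger `alpha` yields a smaller subset. In practice the penalty strength would normally be tuned by cross-validation, and a maintained library implementation would be preferable to this hand-written loop. The objective has the same form as the one documented for scikit-learn's `Lasso`.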