Kidney Research Center, Tabriz University of Medical Sciences, Tabriz, Iran.
Rahat Breath and Sleep Research Center, Tabriz University of Medical Sciences, Tabriz, Iran.
PLoS One. 2024 Aug 16;19(8):e0308531. doi: 10.1371/journal.pone.0308531. eCollection 2024.
Breast cancer, a global concern predominantly impacting women, poses a significant threat when not identified early. While survival rates for breast cancer patients are typically favorable, the emergence of regional metastases markedly diminishes survival prospects. Detecting metastases and comprehending their molecular underpinnings are crucial for tailoring effective treatments and improving patient survival outcomes.
This study applied several artificial intelligence methods. The data were first organized, partitioned by hold-out cross-validation, cleaned, and normalized. Feature selection was then performed using ANOVA and binary Particle Swarm Optimization (PSO). In the analysis phase, the discriminative power of the selected features was evaluated with machine learning classification algorithms. Finally, the SHAP algorithm was applied to the selected features to identify those contributing most to the classification, as a guide to the dominant molecular mechanisms in lymph node metastases.
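The feature selection step can be illustrated with a minimal sketch. It is not the authors' implementation: the ANOVA F-statistic is computed by hand, the binary PSO uses the common sigmoid transfer rule, and the fitness function (sum of selected F-scores minus a per-feature penalty), the score values, and all parameter settings are assumptions chosen for illustration. A wrapper-style fitness would typically use validation accuracy of a classifier instead.

```python
import math
import random


def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                  for g in groups.values())
    within = sum((v - sum(g) / len(g)) ** 2
                 for g in groups.values() for v in g)
    if within == 0:
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


def binary_pso(fitness, n_features, n_particles=20, n_iter=50,
               w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize fitness(mask) over 0/1 masks with a sigmoid-transfer binary PSO."""
    rng = random.Random(seed)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(n_features):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-6.0, min(6.0, v))  # clamp the velocity
                prob_one = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


# Hypothetical per-feature F-scores (illustrative values, not study data).
scores = [13.5, 0.2, 0.1, 9.0, 0.3, 0.05]
PENALTY = 1.0  # assumed cost per selected feature


def toy_fitness(mask):
    return sum(s * m for s, m in zip(scores, mask)) - PENALTY * sum(mask)


mask, fit = binary_pso(toy_fitness, n_features=len(scores))
print(mask, fit)
```

With this toy fitness the best possible mask keeps only the features whose score exceeds the penalty, which makes it easy to check whether the swarm is behaving sensibly before swapping in a classifier-based fitness.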
The analysis of mRNA expression data followed five main steps: data reading, preprocessing, feature selection, classification, and SHAP analysis. Using the candidate mRNAs, the random forest (RF) classifier distinguished lymph node-negative from lymph node-positive cases with an accuracy of 61% and an AUC of 0.6. The SHAP analysis revealed relationships between the selected mRNAs and positive/negative lymph node status. GDF5, BAHCC1, LCN2, FGF14-AS2, and IDH2 ranked as the five most impactful mRNAs by SHAP value.
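To make the idea behind SHAP-based ranking concrete, the sketch below computes exact Shapley values for a toy model by enumerating all feature coalitions and then orders the features by absolute attribution. This is an illustration under stated assumptions, not the study's code: the model, the instance, the all-zero baseline, and the function name `shapley_values` are hypothetical, and practical SHAP explainers for tree models use faster algorithms than brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley attributions for one instance.

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(z):
    # Hypothetical model: one additive term plus one interaction term.
    return 2 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]          # instance to explain (illustrative)
baseline = [0.0, 0.0, 0.0]   # assumed reference input

phi = shapley_values(toy_model, x, baseline)

# Rank features from most to least impactful by absolute attribution.
ranking = sorted(range(len(phi)), key=lambda i: abs(phi[i]), reverse=True)
print(phi, ranking)
```

The attributions sum to the difference between the model output for the instance and for the baseline, which is the property that makes per-feature SHAP values comparable when ranking mRNAs by impact.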
The most prominent mRNAs identified, including GDF5, BAHCC1, LCN2, FGF14-AS2, and IDH2, are implicated in lymph node metastasis. These key candidate genes could inform early detection and tailored therapeutic strategies for lymph node metastasis in patients with breast cancer.