Deep learning-based computational approach for predicting ncRNAs-disease associations in metaplastic breast cancer diagnosis.

Author information

Ahmad Saleem, Zafar Imran, Shafiq Shaista, Sehar Laila, Khalil Hafsa, Matloob Nida, Hina Mehvish, Muntaha Sidra Tul, Khan Hamid, Khan Najeeb Ullah, Rana Samreen, Unar Ahsanullah, Azmat Muhammad, Shafiq Muhammad, Jardan Yousef A Bin, Dauelbait Musaab, Bourhia Mohammed

Affiliations

Department of Cell Biology and Physiology, University of Kansas Medical Center, Kansas City, KS, 66160, USA.

Department of Biochemistry and Biotechnology, Faculty of Science, The University of Faisalabad (TUF), Faisalabad, Punjab, Pakistan.

Publication information

BMC Cancer. 2025 May 6;25(1):830. doi: 10.1186/s12885-025-14113-z.

Abstract

Non-coding RNAs (ncRNAs) play a crucial role in breast cancer progression, necessitating advanced computational approaches for precise disease classification. This study introduces a Deep Reinforcement Learning (DRL)-based framework for predicting ncRNA-disease associations in metaplastic breast cancer (MBC) using a multi-dimensional descriptor system (ncRNADS) integrating 550 sequence-based features and 1,150 target gene descriptors (miRDB score ≥ 90). The model achieved 96.20% accuracy, 96.48% precision, 96.10% recall, and a 96.29% F1-score, outperforming traditional classifiers such as support vector machines (SVM) and neural networks. Feature selection and optimization reduced dimensionality by 42.5% (4,430 to 2,545 features) while maintaining high accuracy, demonstrating computational efficiency. External validation confirmed model specificity to breast cancer subtypes (87-96.5% accuracy) and minimal cross-reactivity with unrelated diseases like Alzheimer's (8-9% accuracy), ensuring robustness. SHAP analysis identified key sequence motifs (e.g., "UUG") and structural free energy (ΔG = -12.3 kcal/mol) as critical predictors, validated by PCA (82% variance) and t-SNE clustering. Survival analysis using TCGA data revealed prognostic significance for MALAT1, HOTAIR, and NEAT1 (associated with poor survival, HR = 1.76-2.71) and GAS5 (protective effect, HR = 0.60). The DRL model demonstrated rapid training (0.08 s/epoch) and cloud deployment compatibility, underscoring its scalability for large-scale applications. These findings establish ncRNA-driven classification as a cornerstone for precision oncology, enabling patient stratification, survival prediction, and therapeutic target identification in MBC.
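As a quick consistency check on the figures reported in the abstract, the sketch below recomputes the F1-score as the harmonic mean of the stated precision and recall, and the dimensionality reduction from the stated feature counts. It uses only the numbers quoted above; the variable names are illustrative, and the F1 formula is the standard definition, which the abstract does not spell out.

```python
# Values quoted in the abstract (percentages and feature counts).
precision = 96.48
recall = 96.10
features_before = 4430
features_after = 2545

# Standard F1 definition: harmonic mean of precision and recall.
# (Assumption: the abstract does not state which formula was used.)
f1 = 2 * precision * recall / (precision + recall)

# Relative reduction in the number of features, as a percentage.
reduction_pct = 100 * (features_before - features_after) / features_before

print(f"F1 = {f1:.2f}%")
print(f"Dimensionality reduction = {reduction_pct:.2f}%")
```

The recomputed F1 rounds to the reported 96.29%, and the reduction comes out at roughly 42.55%, which agrees with the reported 42.5% up to rounding.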
