Ensemble learning-based predictor for driver synonymous mutation with sequence representation.

Authors

Bi Chuanmei, Shi Yong, Xia Junfeng, Liang Zhen, Wu Zhiqiang, Xu Kai, Cheng Na

Affiliations

School of Biomedical Engineering, Anhui Medical University, Hefei, China.

Institutes of Physical Science and Information Technology, Anhui University, Hefei, China.

Publication information

PLoS Comput Biol. 2025 Jan 6;21(1):e1012744. doi: 10.1371/journal.pcbi.1012744. eCollection 2025 Jan.

DOI:10.1371/journal.pcbi.1012744
PMID:39761306
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11737855/
Abstract

Synonymous mutations, once considered neutral, are now understood to have significant implications for a variety of diseases, particularly cancer. It is indispensable to identify these driver synonymous mutations in human cancers, yet current methods are constrained by data limitations. In this study, we initially investigate the impact of sequence-based features, including DNA shape, physicochemical properties and one-hot encoding of nucleotides, and deep learning-derived features from pre-trained chemical molecule language models based on BERT. Subsequently, we propose EPEL, an effect predictor for synonymous mutations employing ensemble learning. EPEL combines five tree-based models and optimizes feature selection to enhance predictive accuracy. Notably, the incorporation of DNA shape features and deep learning-derived features from chemical molecule represents a pioneering effect in assessing the impact of synonymous mutations in cancer. Compared to existing state-of-the-art methods, EPEL demonstrates superior performance on the independent test dataset. Furthermore, our analysis reveals a significant correlation between effect scores and patient outcomes across various cancer types. Interestingly, while deep learning methods have shown promise in other fields, their DNA sequence representations do not significantly enhance the identification of driver synonymous mutations in this study. Overall, we anticipate that EPEL will facilitate researchers to more precisely target driver synonymous mutations. EPEL is designed with flexibility, allowing users to retrain the prediction model and generate effect scores for synonymous mutations in human cancers. A user-friendly web server for EPEL is available at http://ahmu.EPEL.bio/.
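
Two ideas named in the abstract, one-hot encoding of nucleotides and combining several models into a single effect score, can be illustrated with a minimal sketch. The abstract does not state how the five tree-based models are combined, so the simple averaging (soft voting) used here is an assumption. The function names and the stand-in models are hypothetical and are not taken from the EPEL code.

```python
from statistics import mean
from typing import Callable, List, Sequence

NUCLEOTIDES = "ACGT"  # column order of the one-hot encoding


def one_hot_encode(seq: str) -> List[int]:
    """Flatten a DNA sequence into a one-hot vector, four values per base."""
    vector: List[int] = []
    for base in seq.upper():
        # Bases outside A/C/G/T (for example N) become an all-zero group.
        vector.extend(1 if base == nt else 0 for nt in NUCLEOTIDES)
    return vector


def ensemble_score(
    features: Sequence[int],
    models: Sequence[Callable[[Sequence[int]], float]],
) -> float:
    """Average the per-model scores (soft voting; an assumed combination rule)."""
    return mean(model(features) for model in models)


# Hypothetical stand-ins for trained tree-based models: each maps a
# feature vector to a probability-like score in [0, 1].
stand_in_models = [
    lambda x: 0.25,
    lambda x: 0.5,
    lambda x: 0.75,
]

features = one_hot_encode("ACGTN")
score = ensemble_score(features, stand_in_models)
```

In practice each stand-in would be replaced by a fitted classifier's probability output, and the feature vector would also carry the DNA shape, physicochemical, and language-model features described above.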
