Development of an experiment-split method for benchmarking the generalization of a PTM site predictor: Lysine methylome as an example.

Affiliations

School of Basic Medicine, Qingdao University, Qingdao, China.

College of Life Science, Qingdao University, Qingdao, China.

Publication information

PLoS Comput Biol. 2021 Dec 8;17(12):e1009682. doi: 10.1371/journal.pcbi.1009682. eCollection 2021 Dec.

DOI:10.1371/journal.pcbi.1009682
PMID:34879076
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8687584/
Abstract

Many computational classifiers have been developed to predict different types of post-translational modification (PTM) sites. Their performance is usually measured by cross-validation or an independent test, in which experimental data from different sources are pooled and randomly split into training and test sets. However, the self-reported performance of most classifiers measured this way is generally higher than their performance on new experimental data, which suggests that cross-validation overestimates a classifier's generalization ability. Here, we propose a generalization-estimation method, dubbed the experiment-split test, in which the experimental sources for the training set differ from those for the test set, so that the test set simulates data derived from a new experiment. Taking the prediction of the lysine methylome (Kme) as an example, we developed a deep learning-based Kme site predictor, called DeepKme, with outstanding performance. We assessed the experiment-split test by comparing it with cross-validation and found that the performance measured by the experiment-split test is lower than that measured by cross-validation. Because the test data in the experiment-split method come from an independent experimental source, this method can reflect the generalization of the predictor. We therefore believe that the experiment-split method can be applied to benchmark the practical performance of a given PTM model. DeepKme is freely accessible at https://github.com/guoyangzou/DeepKme.
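
The difference between a pooled random split and an experiment-split can be illustrated with a minimal sketch. This is not the DeepKme code: the toy records, the field names (`site`, `source`, `label`), and the helper names are all assumptions made for illustration. The sketch also assumes each site is reported by exactly one experimental source.

```python
import random


def random_split(records, test_fraction=0.2, seed=0):
    """Conventional split: pool every source, shuffle, then cut."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def experiment_split(records, test_sources):
    """Experiment-style split: hold out whole experimental sources."""
    held_out = set(test_sources)
    train = [r for r in records if r["source"] not in held_out]
    test = [r for r in records if r["source"] in held_out]
    return train, test


def shared_sources(train, test):
    """Sources that contribute records to both the training and test sets."""
    return {r["source"] for r in train} & {r["source"] for r in test}


# Hypothetical toy data: 20 sites spread evenly over 4 experimental sources.
records = [
    {"site": f"P{i:03d}_K{10 + i}", "source": f"experiment_{i % 4}", "label": i % 2}
    for i in range(20)
]

train_r, test_r = random_split(records)
train_e, test_e = experiment_split(records, test_sources={"experiment_3"})

# In the random split, the test sources also appear in the training set.
print("random split, shared sources:", sorted(shared_sources(train_r, test_r)))
# In the experiment-split, no source appears on both sides.
print("experiment split, shared sources:", sorted(shared_sources(train_e, test_e)))
```

Under these assumptions, a model scored on `test_e` sees only data from a source it was never trained on, which is the situation the experiment-split test is meant to simulate.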

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c9a9/8687584/fc6dbff570b0/pcbi.1009682.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c9a9/8687584/57f7816e4532/pcbi.1009682.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c9a9/8687584/c350c97c157a/pcbi.1009682.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c9a9/8687584/fce896e66a33/pcbi.1009682.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c9a9/8687584/cfe55556c5f4/pcbi.1009682.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c9a9/8687584/6a09117b2cc9/pcbi.1009682.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c9a9/8687584/f4204cdace96/pcbi.1009682.g007.jpg
