Improving protein secondary structure prediction using a simple k-mer model.

Affiliation

Department of Computer Science, University of Bristol, Woodland Road, Bristol BS8 1UB, UK.

Publication information

Bioinformatics. 2010 Mar 1;26(5):596-602. doi: 10.1093/bioinformatics/btq020. Epub 2010 Feb 3.

DOI:10.1093/bioinformatics/btq020
PMID:20130034
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2828123/
Abstract

MOTIVATION

Some first order methods for protein sequence analysis inherently treat each position as independent. We develop a general framework for introducing longer range interactions. We then demonstrate the power of our approach by applying it to secondary structure prediction; under the independence assumption, sequences produced by existing methods can produce features that are not protein like, an extreme example being a helix of length 1. Our goal was to make the predictions from state of the art methods more realistic, without loss of performance by other measures.
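
To make the "helix of length 1" artifact concrete, here is a minimal sketch, not taken from the paper. It assumes a three-state alphabet (H helix, E strand, C coil) and made-up per-position probabilities; the function names and the `min_len` threshold are hypothetical. Labelling each position independently by its most probable state can leave an isolated helix residue, which a simple run-length scan exposes.

```python
from itertools import groupby

STATES = "HEC"  # assumed three-state alphabet: helix, strand, coil

def independent_prediction(probs):
    """Pick the most probable state at each position, ignoring neighbours."""
    return "".join(max(STATES, key=lambda s: p[s]) for p in probs)

def segments(pred):
    """Return (state, start, length) for each run of identical states."""
    out, pos = [], 0
    for state, run in groupby(pred):
        n = len(list(run))
        out.append((state, pos, n))
        pos += n
    return out

def implausible_helices(pred, min_len=3):
    """Flag helix runs shorter than min_len (the threshold is an assumption)."""
    return [(start, n) for state, start, n in segments(pred)
            if state == "H" and n < min_len]

# Hypothetical per-position probabilities for a 5-residue stretch.
probs = [
    {"H": 0.2, "E": 0.1, "C": 0.7},
    {"H": 0.3, "E": 0.1, "C": 0.6},
    {"H": 0.5, "E": 0.1, "C": 0.4},
    {"H": 0.3, "E": 0.1, "C": 0.6},
    {"H": 0.2, "E": 0.1, "C": 0.7},
]
pred = independent_prediction(probs)   # "CCHCC"
flagged = implausible_helices(pred)    # [(2, 1)]: a helix of length 1 at index 2
```

The scan only detects the problem; it does not decide what the isolated residue should be relabelled as, which is where a model of neighbouring states becomes useful.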

RESULTS

Our framework for longer range interactions is described as a k-mer order model. We succeeded in applying our model to the specific problem of secondary structure prediction, to be used as an additional layer on top of existing methods. We achieved our goal of making the predictions more realistic and protein like, and remarkably this also improved the overall performance. We improve the Segment OVerlap (SOV) score by 1.8%, but more importantly we radically improve the probability of the real sequence given a prediction from an average of 0.271 per residue to 0.385. Crucially, this improvement is obtained using no additional information.
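
The abstract does not spell out the k-mer model, so the following is only an illustrative stand-in for the idea of "an additional layer on top of existing methods". It is a first-order Markov smoother decoded with the Viterbi algorithm, which conditions each state on a single predecessor, and it is not the authors' method. The transition values, the uniform start distribution, and all names are assumptions chosen for the toy example.

```python
import math

STATES = "HEC"  # assumed three-state alphabet: helix, strand, coil

def viterbi_smooth(emissions, transitions, initial):
    """Most probable state path under a first-order Markov prior.

    emissions:   list of dicts, per-position probabilities from a base predictor
    transitions: dict mapping (prev, cur) to a transition probability
    initial:     dict of start-state probabilities
    All probabilities must be strictly positive because logs are taken.
    """
    score = {s: math.log(initial[s]) + math.log(emissions[0][s]) for s in STATES}
    back = []
    for em in emissions[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            # Best predecessor for arriving in state `cur` at this position.
            prev = max(STATES,
                       key=lambda p: score[p] + math.log(transitions[(p, cur)]))
            new_score[cur] = (score[prev]
                              + math.log(transitions[(prev, cur)])
                              + math.log(em[cur]))
            pointers[cur] = prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))

# Toy "sticky" prior: staying in a state is far likelier than switching.
transitions = {(a, b): (0.8 if a == b else 0.1) for a in STATES for b in STATES}
initial = {s: 1.0 / 3.0 for s in STATES}

# Hypothetical base-predictor output whose per-position argmax is "CCHCC".
emissions = [
    {"H": 0.2, "E": 0.1, "C": 0.7},
    {"H": 0.3, "E": 0.1, "C": 0.6},
    {"H": 0.5, "E": 0.1, "C": 0.4},
    {"H": 0.3, "E": 0.1, "C": 0.6},
    {"H": 0.2, "E": 0.1, "C": 0.7},
]
smoothed = viterbi_smooth(emissions, transitions, initial)  # "CCCCC"
```

In this toy case the isolated helix residue is absorbed into the surrounding coil because two state switches cost more than the small gain in emission probability. The layer uses only the base predictor's own output plus a prior over state sequences, which matches the abstract's remark that no additional information is needed.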

AVAILABILITY

http://supfam.cs.bris.ac.uk/kmer

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/16f8/2828123/92b77e1d8a38/btq020f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/16f8/2828123/a3490b36527b/btq020f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/16f8/2828123/ce93cdef00bf/btq020f3.jpg