Prediction of protein domain boundaries from inverse covariances.

Affiliation

MRC National Institute for Medical Research, The Ridgeway, Mill Hill, London, United Kingdom.

Publication information

Proteins. 2013 Feb;81(2):253-60. doi: 10.1002/prot.24181. Epub 2012 Oct 16.

DOI:10.1002/prot.24181

PMID:22987736

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3563215/

Abstract

It has been known even since relatively few structures had been solved that longer protein chains often contain multiple domains, which may fold separately and play the role of reusable functional modules found in many contexts. In many structural biology tasks, in particular structure prediction, it is of great use to be able to identify domains within the structure and analyze these regions separately. However, when using sequence data alone this task has proven exceptionally difficult, with relatively little improvement over the naive method of choosing boundaries based on size distributions of observed domains. The recent significant improvement in contact prediction provides a new source of information for domain prediction. We test several methods for using this information including a kernel smoothing-based approach and methods based on building alpha-carbon models and compare performance with a length-based predictor, a homology search method and four published sequence-based predictors: DOMCUT, DomPRO, DLP-SVM, and SCOOBY-DOmain. We show that the kernel-smoothing method is significantly better than the other ab initio predictors when both single-domain and multidomain targets are considered and is not significantly different to the homology-based method. Considering only multidomain targets the kernel-smoothing method outperforms all of the published methods except DLP-SVM. The kernel smoothing method therefore represents a potentially useful improvement to ab initio domain prediction.
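
The abstract names a kernel smoothing-based approach but does not describe the procedure, so the following is only an illustrative sketch and not the authors' algorithm. It assumes that predicted contacts arrive as `(i, j, score)` triples, that a domain boundary shows up as a position bridged by few predicted contacts, and that a Gaussian kernel is used for smoothing. The function names, the `bandwidth` default, and the `min_domain` cutoff are all hypothetical choices made for this example.

```python
import math


def crossing_profile(length, contacts, bandwidth=5.0):
    """Smoothed weight of predicted contacts that span each position.

    contacts: iterable of (i, j, score) with 0-based residue indices.
    Low values mean few contacts bridge that position, which this
    sketch treats as a hint of a domain boundary (an assumption).
    """
    raw = [0.0] * length
    for i, j, score in contacts:
        lo, hi = min(i, j), max(i, j)
        # A contact (lo, hi) spans every position strictly between its ends.
        for pos in range(lo + 1, hi):
            raw[pos] += score

    # Gaussian kernel smoothing, normalised so chain ends are not penalised.
    smoothed = []
    for p in range(length):
        num = den = 0.0
        for q in range(length):
            w = math.exp(-0.5 * ((p - q) / bandwidth) ** 2)
            num += w * raw[q]
            den += w
        smoothed.append(num / den)
    return smoothed


def predict_boundary(length, contacts, bandwidth=5.0, min_domain=30):
    """Return the position with the lowest smoothed profile, or None.

    Positions closer than `min_domain` residues to either terminus are
    skipped; chains too short to hold two such domains yield None.
    """
    profile = crossing_profile(length, contacts, bandwidth)
    candidates = range(min_domain, length - min_domain)
    if not candidates:
        return None
    return min(candidates, key=lambda p: profile[p])
```

On a synthetic 100-residue chain whose contacts all fall within residues 0-49 or within residues 50-99, the profile dips at the junction and `predict_boundary` picks a position next to it. A real predictor would also need a rule for deciding whether the dip is deep enough to call a boundary at all, which this sketch leaves out.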

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/daf4/3563215/e931f20781d1/prot0081-0253-f1.jpg

Similar articles

Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model.

PLoS Comput Biol. 2017 Jan 5;13(1):e1005324. doi: 10.1371/journal.pcbi.1005324. eCollection 2017 Jan.

Identifying foldable regions in protein sequence from the hydrophobic signal.

Nucleic Acids Res. 2008 Feb;36(2):578-88. doi: 10.1093/nar/gkm1070. Epub 2007 Dec 1.

Ab initio and template-based prediction of multi-class distance maps by two-dimensional recursive neural networks.

BMC Struct Biol. 2009 Jan 30;9:5. doi: 10.1186/1472-6807-9-5.

Prediction of protein domain boundaries from sequence alone.

Protein Sci. 2003 Apr;12(4):696-701. doi: 10.1110/ps.0233103.

DOMAC: an accurate, hybrid protein domain prediction server.

Nucleic Acids Res. 2007 Jul;35(Web Server issue):W354-6. doi: 10.1093/nar/gkm390. Epub 2007 Jun 6.

Improving protein structure similarity searches using domain boundaries based on conserved sequence information.

BMC Struct Biol. 2009 May 19;9:33. doi: 10.1186/1472-6807-9-33.

AIDA: ab initio domain assembly for automated multi-domain protein structure prediction and domain-domain interaction prediction.

Bioinformatics. 2015 Jul 1;31(13):2098-105. doi: 10.1093/bioinformatics/btv092. Epub 2015 Feb 19.

Scooby-domain: prediction of globular domains in protein sequence.

Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W160-3. doi: 10.1093/nar/gki381.

Ab initio and homology based prediction of protein domains by recursive neural networks.

BMC Bioinformatics. 2009 Jun 26;10:195. doi: 10.1186/1471-2105-10-195.

Cited by

ConPlot: web-based application for the visualization of protein contact maps integrated with other data.

Bioinformatics. 2021 Sep 9;37(17):2763-2765. doi: 10.1093/bioinformatics/btab049.

Co-evolution techniques are reshaping the way we do structural bioinformatics.

F1000Res. 2017 Jul 25;6:1224. doi: 10.12688/f1000research.11543.1. eCollection 2017.

Applications of contact predictions to structural biology.

IUCrJ. 2017 Apr 18;4(Pt 3):291-300. doi: 10.1107/S2052252517005115. eCollection 2017 May 1.

Residue contacts predicted by evolutionary covariance extend the application of ab initio molecular replacement to larger and more challenging protein folds.

IUCrJ. 2016 Jun 15;3(Pt 4):259-70. doi: 10.1107/S2052252516008113. eCollection 2016 Jul 1.

Potential DNA binding and nuclease functions of ComEC domains characterized in silico.

Proteins. 2016 Oct;84(10):1431-42. doi: 10.1002/prot.25088. Epub 2016 Jul 1.

PDP-CON: prediction of domain/linker residues in protein sequences using a consensus approach.

J Mol Model. 2016 Apr;22(4):72. doi: 10.1007/s00894-016-2933-0. Epub 2016 Mar 11.

A Structured-based Model for the Decreased Activity of Ala222Val and Glu429Ala Methylenetetrahydrofolate Reductase (MTHFR) Mutants.

Bioinformation. 2013 Nov 11;9(18):929-36. doi: 10.6026/97320630009929. eCollection 2013.

References

Structural constraints on the covariance matrix derived from multiple aligned protein sequences.

PLoS One. 2011;6(12):e28265. doi: 10.1371/journal.pone.0028265. Epub 2011 Dec 5.

Protein 3D structure computed from evolutionary sequence variation.

PLoS One. 2011;6(12):e28766. doi: 10.1371/journal.pone.0028766. Epub 2011 Dec 7.

The Pfam protein families database.

Nucleic Acids Res. 2012 Jan;40(Database issue):D290-301. doi: 10.1093/nar/gkr1065. Epub 2011 Nov 29.

Protein topology from predicted residue contacts.

Protein Sci. 2012 Feb;21(2):299-305. doi: 10.1002/pro.2002. Epub 2011 Dec 21.

PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments.

Bioinformatics. 2012 Jan 15;28(2):184-90. doi: 10.1093/bioinformatics/btr638. Epub 2011 Nov 17.

SMART 7: recent updates to the protein domain annotation resource.

Nucleic Acids Res. 2012 Jan;40(Database issue):D302-5. doi: 10.1093/nar/gkr931. Epub 2011 Nov 3.

Direct correlation analysis improves fold recognition.

Comput Biol Chem. 2011 Oct 12;35(5):323-32. doi: 10.1016/j.compbiolchem.2011.08.002. Epub 2011 Aug 22.

Protein domain assignment from the recurrence of locally similar structures.

Proteins. 2011 Mar;79(3):853-66. doi: 10.1002/prot.22923. Epub 2010 Dec 22.

A new generation of homology search tools based on probabilistic inference.

Genome Inform. 2009 Oct;23(1):205-11.

Assessment of domain boundary predictions and the prediction of intramolecular contacts in CASP8.

Proteins. 2009;77 Suppl 9:196-209. doi: 10.1002/prot.22554.
