bbcontacts: prediction of β-strand pairing from direct coupling patterns.

Affiliations

Gene Center, LMU Munich, Feodor-Lynen-Strasse 25, 81377 Munich, Germany and Max Planck Institute for Biophysical Chemistry, Am Fassberg 11, 37077 Göttingen, Germany.

Publication information

Bioinformatics. 2015 Jun 1;31(11):1729-37. doi: 10.1093/bioinformatics/btv041. Epub 2015 Jan 23.

Abstract

MOTIVATION

It has recently become possible to build reliable de novo models of proteins if a multiple sequence alignment (MSA) of at least 1000 homologous sequences can be built. Methods of global statistical network analysis can explain the observed correlations between columns in the MSA by a small set of directly coupled pairs of columns. Strong couplings are indicative of residue-residue contacts, and from the predicted contacts a structure can be computed. Here, we exploit the structural regularity of paired β-strands that leads to characteristic patterns in the noisy matrices of couplings. The β-β contacts should be detected more reliably than single contacts, reducing the required number of sequences in the MSAs.
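The "characteristic patterns" mentioned above can be illustrated with a toy example. In an idealized, bulge-free ladder, parallel strand pairing places contacts along a diagonal of the coupling matrix, (i+k, j+k), while antiparallel pairing places them along an anti-diagonal, (i+k, j-k). The sketch below is not the bbcontacts algorithm (which uses HMMs); the function names `ladder_pairs` and `ladder_score` and the toy matrix are hypothetical and serve only to show why averaging along a ladder can stand out from single-cell noise.

```python
def ladder_pairs(i, j, length, antiparallel):
    """Residue pairs of an idealized, bulge-free ladder starting at (i, j).

    Assumption for illustration: parallel pairing runs along a diagonal,
    antiparallel pairing along an anti-diagonal of the coupling matrix.
    """
    step = -1 if antiparallel else 1
    return [(i + k, j + step * k) for k in range(length)]


def ladder_score(coupling, i, j, length, antiparallel):
    """Mean coupling score along the ladder; None if it leaves the matrix."""
    n = len(coupling)
    pairs = ladder_pairs(i, j, length, antiparallel)
    if any(not (0 <= a < n and 0 <= b < n) for a, b in pairs):
        return None
    return sum(coupling[a][b] for a, b in pairs) / length


# Toy 8x8 symmetric coupling matrix: weak background plus one
# antiparallel ladder at (0,7), (1,6), (2,5).
n = 8
coupling = [[0.1] * n for _ in range(n)]
for a, b in ladder_pairs(0, 7, 3, antiparallel=True):
    coupling[a][b] = coupling[b][a] = 1.0

# Score the true antiparallel ladder and a parallel ladder that
# crosses it in a single cell.
anti = ladder_score(coupling, 0, 7, 3, antiparallel=True)
para = ladder_score(coupling, 0, 5, 3, antiparallel=False)
print(anti, para)
```

In this toy matrix the antiparallel ladder scores well above the crossing parallel ladder, which only shares one strong cell with it.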

RESULTS

bbcontacts predicts β-β contacts by detecting these characteristic patterns in the 2D map of coupling scores using two hidden Markov models (HMMs), one for parallel and one for antiparallel contacts. β-bulges are modelled as indel states. In contrast to existing methods, bbcontacts uses predicted instead of true secondary structure. On a standard set of 916 test proteins, 34% of which have MSAs with < 1000 sequences, bbcontacts achieves 50% precision for contacting β-β residue pairs at 50% recall using predicted secondary structure and 64% precision at 64% recall using true secondary structure, while existing tools achieve around 45% precision at 45% recall using true secondary structure.
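For readers less familiar with the metrics quoted above, the following is a minimal sketch of precision and recall computed over sets of residue pairs, using the standard definitions. The abstract does not describe the exact evaluation protocol, so the function name `precision_recall`, the treatment of (i, j) and (j, i) as the same pair, and the example pairs are all assumptions made for illustration.

```python
def precision_recall(predicted, true):
    """Precision and recall of predicted residue pairs against true pairs.

    Assumption: a contact is unordered, so (i, j) and (j, i) are the same.
    """
    predicted = {tuple(sorted(p)) for p in predicted}
    true = {tuple(sorted(p)) for p in true}
    tp = len(predicted & true)  # correctly predicted contacts
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(true) if true else 0.0
    return precision, recall


# Made-up example: four true contacts, four predictions, two correct.
true_pairs = [(3, 20), (4, 19), (5, 18), (6, 17)]
predicted_pairs = [(20, 3), (4, 19), (5, 30), (7, 16)]

p, r = precision_recall(predicted_pairs, true_pairs)
print(p, r)  # 0.5 0.5
```

Reporting precision at the point where it equals recall, as the abstract does, gives a single operating point that makes methods easy to compare.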

AVAILABILITY AND IMPLEMENTATION

bbcontacts is open source software (GNU Affero GPL v3) available at https://bitbucket.org/soedinglab/bbcontacts .

