Evolutionary inaccuracy of pairwise structural alignments.

Author information

Division of Mathematical Biology, MRC National Institute for Medical Research, The Ridgeway, Mill Hill, London, UK.

Publication information

Bioinformatics. 2012 May 1;28(9):1209-15. doi: 10.1093/bioinformatics/bts103. Epub 2012 Mar 6.

Abstract

MOTIVATION

Structural alignment methods are widely used to generate gold standard alignments for improving multiple sequence alignments and transferring functional annotations, as well as for assigning structural distances between proteins. However, the correctness of the alignments generated by these methods is difficult to assess objectively since little is known about the exact evolutionary history of most proteins. Since homology is an equivalence relation, an upper bound on alignment quality can be found by assessing the consistency of alignments. Measuring the consistency of current methods of structure alignment and determining the causes of inconsistencies can, therefore, provide information on the quality of current methods and suggest possibilities for further improvement.
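The consistency argument can be illustrated with a small sketch. If residue a of protein A is aligned to residue b of B, and b is aligned to residue c of C, then transitivity of homology implies that the direct A-C alignment should also pair a with c. The Python below is only an illustration under stated assumptions: the abstract does not give the exact measure used in the paper, so the function name `transitive_consistency`, the dictionary representation of an alignment, and the toy data are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical representation: a pairwise alignment maps a residue index in the
# first protein to the residue index it is paired with in the second protein.
# Residues that fall opposite a gap are simply absent from the mapping.
Alignment = Dict[int, int]


def transitive_consistency(ab: Alignment, bc: Alignment, ac: Alignment) -> Optional[float]:
    """Fraction of A residues whose path A->B->C agrees with the direct A->C pairing.

    Returns None when no residue of A can be followed through B into C.
    """
    checked = 0
    agree = 0
    for a, b in ab.items():
        if b not in bc:
            continue  # the chain is broken by a gap in B-C; nothing to compare
        checked += 1
        if ac.get(a) == bc[b]:
            agree += 1
    return agree / checked if checked else None


# Toy data (invented for illustration, not taken from the paper).
ab = {0: 0, 1: 1, 2: 2, 3: 4}
bc = {0: 1, 1: 2, 2: 3, 4: 5}
ac = {0: 1, 1: 2, 2: 4, 3: 5}  # residue 2 disagrees with the A->B->C path

score = transitive_consistency(ab, bc, ac)
print(score)  # 0.75
```

In this toy case three of the four traceable residues agree, so one residue in four is inconsistent. Because at least one of the three alignments must be wrong wherever the paths disagree, such a score bounds alignment quality from above without needing to know the true evolutionary history.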

RESULTS

We analyze the self-consistency of seven widely-used structural alignment methods (SAP, TM-align, Fr-TM-align, MAMMOTH, DALI, CE and FATCAT) on a diverse, non-redundant set of 1863 domains from the SCOP database and demonstrate that even for relatively similar proteins the degree of inconsistency of the alignments on a residue level is high (30%). We further show that levels of consistency vary substantially between methods, with two methods (SAP and Fr-TM-align) producing more consistent alignments than the rest. Inconsistency is found to be higher near gaps and for proteins of low structural complexity, as well as for helices. The ability of the methods to identify good structural alignments is also assessed using geometric measures, for which FATCAT (flexible mode) is found to be the best performer despite being highly inconsistent. We conclude that there is substantial scope for improving the consistency of structural alignment methods.

CONTACT

msadows@nimr.mrc.ac.uk

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e6e4/3338010/6c1424a55ad0/bts103f1.jpg
