Detecting coevolution without phylogenetic trees? Tree-ignorant metrics of coevolution perform as well as tree-aware metrics.

Author information

Caporaso J Gregory, Smit Sandra, Easton Brett C, Hunter Lawrence, Huttley Gavin A, Knight Rob

Affiliation

Department of Chemistry and Biochemistry, University of Colorado at Boulder, Boulder, CO, USA.

Publication information

BMC Evol Biol. 2008 Dec 3;8:327. doi: 10.1186/1471-2148-8-327.

Abstract

BACKGROUND

Identifying coevolving positions in protein sequences has myriad applications, ranging from understanding and predicting the structure of single molecules to generating proteome-wide predictions of interactions. Algorithms for detecting coevolving positions can be classified into two categories: tree-aware, which incorporate knowledge of phylogeny, and tree-ignorant, which do not. Tree-ignorant methods are frequently orders of magnitude faster, but are widely held to be insufficiently accurate because of a confounding of shared ancestry with coevolution. We conjectured that by using a null distribution that appropriately controls for the shared-ancestry signal, tree-ignorant methods would exhibit equivalent statistical power to tree-aware methods. Using a novel t-test transformation of coevolution metrics, we systematically compared four tree-aware and five tree-ignorant coevolution algorithms, applying them to myoglobin and myosin. We further considered the influence of sequence recoding using reduced-state amino acid alphabets, a common tactic employed in coevolutionary analyses to improve both statistical and computational performance.
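As an illustration of what a tree-ignorant metric looks like, the following is a minimal sketch of Mutual Information between two alignment columns, computed from entropies as MI(A, B) = H(A) + H(B) - H(A, B). The function names and the toy alignment are hypothetical and are not taken from the paper, and the sketch does not include the t-test transformation, which the abstract does not specify.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (in bits) of the symbols observed in one column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def mutual_information(col_a, col_b):
    """MI between two alignment columns: H(A) + H(B) - H(A, B)."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must contain the same number of sequences")
    joint = list(zip(col_a, col_b))  # paired states, one pair per sequence
    return column_entropy(col_a) + column_entropy(col_b) - column_entropy(joint)


# Toy alignment (hypothetical data): one string per sequence.
alignment = ["AKDE", "AKDE", "GRDE", "GRDK"]
columns = list(zip(*alignment))  # transpose rows into columns

# Score every pair of positions without reference to any tree.
scores = {
    (i, j): mutual_information(columns[i], columns[j])
    for i in range(len(columns))
    for j in range(i + 1, len(columns))
}
```

Because the score uses only column frequencies, it needs no phylogeny, which is the source of the speed advantage. The same property is why raw scores can mix shared ancestry with coevolution.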

RESULTS

Consistent with our conjecture, the transformed tree-ignorant metrics (particularly Mutual Information) often outperformed the tree-aware metrics. Our examination of the effect of recoding suggested that charge-based alphabets were generally superior for identifying the stabilizing interactions in alpha helices. Performance was not always improved by recoding, however, indicating that the choice of alphabet is critical.
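To make the idea of recoding concrete, here is a minimal sketch of mapping sequences onto a reduced, charge-based alphabet before scoring. The three-state grouping below (K and R as positive, D and E as negative, everything else as uncharged) is an assumption chosen for illustration; the abstract does not list the alphabets the authors actually used, and the names are hypothetical.

```python
# Assumed three-state charge alphabet (illustrative, not the paper's):
# P = positively charged, N = negatively charged, U = all other residues.
CHARGE_GROUPS = {"P": "KR", "N": "DE"}
RESIDUE_TO_STATE = {
    residue: state
    for state, residues in CHARGE_GROUPS.items()
    for residue in residues
}
GAP_CHARS = "-."


def recode_by_charge(sequence):
    """Map each residue to its charge state; gap characters are kept as-is."""
    recoded = []
    for residue in sequence.upper():
        if residue in GAP_CHARS:
            recoded.append(residue)
        else:
            recoded.append(RESIDUE_TO_STATE.get(residue, "U"))
    return "".join(recoded)


# Toy alignment (hypothetical data), recoded row by row.
alignment = ["AKDE-R", "GRDEAK", "GKEDAR"]
recoded_alignment = [recode_by_charge(seq) for seq in alignment]
```

Collapsing twenty states into three reduces the number of joint states a metric must estimate, which is where the statistical and computational benefit comes from. It also discards any distinction the grouping does not encode, which is consistent with the observation that recoding does not always help.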

CONCLUSION

The results suggest that t-test transformation of tree-ignorant metrics can be sufficient to control for patterns arising from shared ancestry.
