Alignment-Integrated Reconstruction of Ancestral Sequences Improves Accuracy.

Author information

Department of Microbiology and Cell Science, Institute of Food and Agricultural Sciences, University of Florida.

Publication information

Genome Biol Evol. 2020 Sep 1;12(9):1549-1565. doi: 10.1093/gbe/evaa164.

Abstract

Ancestral sequence reconstruction (ASR) uses an alignment of extant protein sequences, a phylogeny describing the history of the protein family and a model of the molecular-evolutionary process to infer the sequences of ancient proteins, allowing researchers to directly investigate the impact of sequence evolution on protein structure and function. Like all statistical inferences, ASR can be sensitive to violations of its underlying assumptions. Previous studies have shown that, whereas phylogenetic uncertainty has only a very weak impact on ASR accuracy, uncertainty in the protein sequence alignment can more strongly affect inferred ancestral sequences. Here, we show that errors in sequence alignment can produce errors in ASR across a range of realistic and simplified evolutionary scenarios. Importantly, sequence reconstruction errors can lead to errors in estimates of structural and functional properties of ancestral proteins, potentially undermining the reliability of analyses relying on ASR. We introduce an alignment-integrated ASR approach that combines information from many different sequence alignments. We show that integrating alignment uncertainty improves ASR accuracy and the accuracy of downstream structural and functional inferences, often performing as well as highly accurate structure-guided alignment. Given the growing evidence that sequence alignment errors can impact the reliability of ASR studies, we recommend that future studies incorporate approaches to mitigate the impact of alignment uncertainty. Probabilistic modeling of insertion and deletion events has the potential to radically improve ASR accuracy when the model reflects the true underlying evolutionary history, but further studies are required to thoroughly evaluate the reliability of these approaches under realistic conditions.
