Short-range template switching in great ape genomes explored using pair hidden Markov models.

Author information

European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, United Kingdom.

Department of Genetics, University of Cambridge, Cambridge, United Kingdom.

Publication information

PLoS Genet. 2021 Mar 2;17(3):e1009221. doi: 10.1371/journal.pgen.1009221. eCollection 2021 Mar.

Abstract

Many complex genomic rearrangements arise through template switch errors, which occur in DNA replication when there is a transient polymerase switch to an alternate template nearby in three-dimensional space. While typically investigated at kilobase-to-megabase scales, the genomic and evolutionary consequences of this mutational process are not well characterised at smaller scales, where they are often interpreted as clusters of independent substitutions, insertions and deletions. Here we present an improved statistical approach using pair hidden Markov models, and use it to detect and describe short-range template switches underlying clusters of mutations in the multi-way alignment of hominid genomes. Using robust statistics derived from evolutionary genomic simulations, we show that template switch events have been widespread in the evolution of the great apes' genomes and provide a parsimonious explanation for the presence of many complex mutation clusters in their phylogenetic context. Larger-scale mechanisms of genome rearrangement are typically associated with structural features around breakpoints, and accordingly we show that atypical patterns of secondary structure formation and DNA bending are present at the initial template switch loci. Our methods improve on previous non-probabilistic approaches for computational detection of template switch mutations, allowing the statistical significance of events to be assessed. By specifying realistic evolutionary parameters based on the genomes and taxa involved, our methods can be readily adapted to other intra- or inter-species comparisons.
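For readers unfamiliar with pair hidden Markov models, the following is a minimal sketch of a generic three-state pair HMM (match, gap in one sequence, gap in the other) scored with the Viterbi recursion. It is not the authors' template-switch model. The function name `pair_hmm_viterbi` and the parameter values (`delta`, `epsilon`, `p_match`) are illustrative assumptions and are not taken from the paper. The sketch shows only how a probabilistic alignment model assigns a log-probability to a pair of sequences, which is the kind of quantity that lets the statistical significance of an event be assessed.

```python
import math

NEG_INF = float("-inf")


def pair_hmm_viterbi(x, y, delta=0.05, epsilon=0.3, p_match=0.9):
    """Log-probability of the best alignment of x and y under a
    generic three-state pair HMM (illustrative parameters only)."""

    def log_pair(a, b):
        # Match state emits an aligned pair of bases (16 pairs, sums to 1):
        # identical pairs share p_match, mismatched pairs share 1 - p_match.
        if a == b:
            return math.log(p_match / 4)
        return math.log((1 - p_match) / 12)

    log_gap = math.log(0.25)          # gap states emit one base uniformly
    log_mm = math.log(1 - 2 * delta)  # match -> match
    log_mg = math.log(delta)          # match -> either gap state (gap open)
    log_gg = math.log(epsilon)        # gap -> same gap state (gap extend)
    log_gm = math.log(1 - epsilon)    # gap -> match

    n, m = len(x), len(y)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]  # x[i-1] aligned to y[j-1]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]  # x[i-1] aligned to a gap
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]  # y[j-1] aligned to a gap
    M[0][0] = 0.0  # by convention, start in the match state

    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:
                M[i][j] = log_pair(x[i - 1], y[j - 1]) + max(
                    log_mm + M[i - 1][j - 1],
                    log_gm + X[i - 1][j - 1],
                    log_gm + Y[i - 1][j - 1],
                )
            if i > 0:
                X[i][j] = log_gap + max(log_mg + M[i - 1][j], log_gg + X[i - 1][j])
            if j > 0:
                Y[i][j] = log_gap + max(log_mg + M[i][j - 1], log_gg + Y[i][j - 1])

    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    # A similar pair should score higher than a dissimilar one.
    print(pair_hmm_viterbi("ACGTACGT", "ACGTACGT"))
    print(pair_hmm_viterbi("ACGTACGT", "TTTTGGGG"))
```

Under this kind of model, two competing explanations of the same mutation cluster can each be given a log-probability and compared, which a non-probabilistic detection rule cannot do directly.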
