Estimation of DNA sequence context-dependent mutation rates using primate genomic sequences.

Author information

Zhang Wei, Bouffard Gerard G, Wallace Susan S, Bond Jeffrey P

Affiliation

Department of Medicine, University of Chicago, 515 CLSC, Chicago, IL 60637, USA.

Publication information

J Mol Evol. 2007 Sep;65(3):207-14. doi: 10.1007/s00239-007-9000-5. Epub 2007 Aug 4.

Abstract

It is understood that DNA and amino acid substitution rates are highly sequence context-dependent: for example, C → T substitutions in vertebrates may occur much more frequently at CpG sites, and cysteine substitution rates may depend on whether the context supports participation in a disulfide bond. Furthermore, many applications rely on quantitative models of nucleotide or amino acid substitution, including phylogenetic inference and identification of amino acid sequence positions involved in functional specificity. We describe quantification of the context dependence of nucleotide substitution rates using baboon, chimpanzee, and human genomic sequence data generated by the NISC Comparative Sequencing Program. Relative mutation rates are reported for the 96 classes of mutations of the form 5' αβγ 3' → 5' αδγ 3', where α, β, γ, and δ are nucleotides and β ≠ δ, based on maximum likelihood calculations. Our results confirm that C → T substitutions are enhanced at CpG sites compared with other transitions, relatively independent of the identity of the preceding nucleotide. While, as expected, transitions generally occur more frequently than transversions, we find that the most frequent transversions involve the C at CpG sites (CpG transversions) and that their rate is comparable to the rate of transitions at non-CpG sites. A four-class model of the rates of context-dependent evolution of primate DNA sequences, CpG transitions > non-CpG transitions ≈ CpG transversions > non-CpG transversions, captures qualitative features of the mutation spectrum. We find that despite qualitative similarity of mutation rates among different genomic regions, there are statistically significant differences.
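The class structure described in the abstract can be illustrated with a short sketch. Enumerating every 5' αβγ 3' → 5' αδγ 3' substitution with β ≠ δ gives 4 × 4 × 4 × 3 = 192 events; one common way to arrive at 96 classes, assumed here and not stated in the abstract, is to treat a substitution and its reverse complement as the same class. The function names (`canonical`, `four_class`) and the rule that a site counts as CpG when the mutated base is the C or the G of a CpG dinucleotide are illustrative choices, not the authors' code.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")
PURINES = {"A", "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(triplet: str, new_base: str) -> tuple[str, str]:
    """Fold a substitution onto the strand where the mutated base is a pyrimidine.

    Assumption: strand-symmetric classes, so each event and its reverse
    complement share one label.
    """
    if triplet[1] in PURINES:
        return reverse_complement(triplet), new_base.translate(COMPLEMENT)
    return triplet, new_base


def is_transition(old: str, new: str) -> bool:
    """Purine <-> purine or pyrimidine <-> pyrimidine."""
    return (old in PURINES) == (new in PURINES)


def is_cpg(triplet: str) -> bool:
    """True if the central base is the C or the G of a CpG dinucleotide."""
    alpha, beta, gamma = triplet
    return (beta == "C" and gamma == "G") or (beta == "G" and alpha == "C")


def four_class(triplet: str, new_base: str) -> str:
    """Label a substitution with one of the four classes named in the abstract."""
    site = "CpG" if is_cpg(triplet) else "non-CpG"
    kind = "transition" if is_transition(triplet[1], new_base) else "transversion"
    return f"{site} {kind}"


# Map each strand-collapsed class (triplet, new central base) to its label.
classes = {}
for alpha, beta, gamma, delta in product(BASES, repeat=4):
    if beta == delta:
        continue
    key = canonical(alpha + beta + gamma, delta)
    classes[key] = four_class(*key)

class_sizes = Counter(classes.values())

if __name__ == "__main__":
    print(len(classes))
    for label, size in sorted(class_sizes.items()):
        print(label, size)
```

Under these assumptions the 96 classes split into 4 CpG transitions (one per preceding nucleotide α), 8 CpG transversions, 28 non-CpG transitions, and 56 non-CpG transversions. The sketch only labels classes; it does not estimate rates, which the abstract attributes to maximum likelihood calculations.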

