Analysis of site frequency spectra from Arabidopsis with context-dependent corrections for ancestral misinference.

Authors

Morton Brian R, Dar Vaqaar-un-Nisa, Wright Stephen I

Affiliation

Department of Biological Science, Barnard College, Columbia University, New York, New York 10027, USA.

Publication information

Plant Physiol. 2009 Feb;149(2):616-24. doi: 10.1104/pp.108.127787. Epub 2008 Nov 19.

DOI:10.1104/pp.108.127787

PMID:19019983

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2633827/

Abstract

Previous studies have shown that the pattern of single nucleotide polymorphism (SNP) in Arabidopsis (Arabidopsis thaliana) deviates from the distribution expected under a neutral model. Here, we test whether or not ancestral misinference could explain this deviation. We start by showing that there are significant and complex influences of context on mutation dynamics as inferred from SNP frequency, in Arabidopsis, and compare the results to observations about context dependency that have been made on a previous analysis of a maize (Zea mays) SNP dataset. The data concerning heterogeneity across sites are then used to make corrections for ancestral misinference in a context-dependent manner. Using Arabidopsis lyrata to infer the ancestral state for SNPs, we show that the resulting unfolded site frequency spectrum (SFS) in Arabidopsis is skewed toward sites with high frequency derived nucleotides. Sites are also partitioned into two general functional classes, second codon position and 4-fold degenerate sites. These two classes show different SFS; although both show an overrepresentation of high frequency derived sites, low frequency derived sites are vastly overrepresented at the second codon position, but significantly underrepresented at 4-fold degenerate sites. We find that these results are robust to corrections for ancestral misinference, even when context-dependent variation in mutation properties is taken into consideration. The data suggest that, in addition to purifying selection, complex demographic events and/or linked positive selection need to be invoked to explain the SFS, and they highlight the importance of sequence context in analyses of genome-wide variation.
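The two computational ideas in the abstract, unfolding the SFS with an outgroup and correcting it for ancestral misinference, can be illustrated with a minimal Python sketch. This is not the authors' procedure. It assumes a simple mixture model in which a fraction `p` of sites in a context class have their ancestral state misassigned, so that a site with derived count `i` is recorded at `n - i`. The function names, the context labels, and the per-context rates `p` are hypothetical and would have to be estimated separately.

```python
from collections import Counter


def unfolded_sfs(sites, n):
    """Count sites by derived-allele count (0..n).

    sites: iterable of (ingroup_alleles, outgroup_allele), where
    ingroup_alleles is a string of n bases, one per sampled chromosome.
    Only biallelic sites whose outgroup base matches one of the two
    ingroup alleles are used; the outgroup base is taken as ancestral.
    """
    sfs = [0] * (n + 1)
    for ingroup, outgroup in sites:
        counts = Counter(ingroup)
        if len(ingroup) != n or len(counts) != 2 or outgroup not in counts:
            continue  # skip monomorphic, multiallelic, or unpolarizable sites
        sfs[n - counts[outgroup]] += 1
    return sfs


def correct_misinference(sfs, p):
    """Invert obs[i] = (1 - p) * true[i] + p * true[n - i] for true[i].

    p is the assumed probability that the ancestral state is misinferred.
    """
    if not 0 <= p < 0.5:
        raise ValueError("p must satisfy 0 <= p < 0.5")
    n = len(sfs) - 1
    return [((1 - p) * sfs[i] - p * sfs[n - i]) / (1 - 2 * p)
            for i in range(n + 1)]


def corrected_sfs_by_context(sfs_by_context, p_by_context):
    """Correct each context class with its own rate, then sum the classes."""
    n = len(next(iter(sfs_by_context.values()))) - 1
    total = [0.0] * (n + 1)
    for context, sfs in sfs_by_context.items():
        corrected = correct_misinference(sfs, p_by_context[context])
        total = [a + b for a, b in zip(total, corrected)]
    return total


# Toy data: four sites sampled in n = 4 chromosomes, plus an outgroup base.
toy_sites = [("AAAT", "A"), ("AATT", "T"), ("CCCC", "C"), ("GGGA", "A")]
toy_sfs = unfolded_sfs(toy_sites, 4)  # [0, 1, 1, 1, 0]
```

Under this model, misinference moves low-frequency derived sites into the high-frequency classes, which is why an uncorrected spectrum can show a spurious excess of high-frequency derived alleles. Applying the correction separately per context class, as in `corrected_sfs_by_context`, lets the misinference rate differ between, for example, sites with different neighboring bases.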


Similar articles

Genomic variations and distinct evolutionary rate of rare alleles in Arabidopsis thaliana.

BMC Evol Biol. 2016 Jan 27;16:25. doi: 10.1186/s12862-016-0590-7.

Selective constraints on codon usage of nuclear genes from Arabidopsis thaliana.

Mol Biol Evol. 2007 Jan;24(1):122-9. doi: 10.1093/molbev/msl139. Epub 2006 Oct 4.

Transcription-related mutations and GC content drive variation in nucleotide substitution rates across the genomes of Arabidopsis thaliana and Arabidopsis lyrata.

BMC Evol Biol. 2007 Apr 23;7:66. doi: 10.1186/1471-2148-7-66.

Genome-wide analysis on the maize genome reveals weak selection on synonymous mutations.

BMC Genomics. 2020 Apr 29;21(1):333. doi: 10.1186/s12864-020-6745-3.

Rates and patterns of molecular evolution in inbred and outbred Arabidopsis.

Mol Biol Evol. 2002 Sep;19(9):1407-20. doi: 10.1093/oxfordjournals.molbev.a004204.

Selection on amino acid substitutions in Arabidopsis.

Mol Biol Evol. 2008 Jul;25(7):1375-83. doi: 10.1093/molbev/msn079. Epub 2008 Apr 4.

Genomic determinants of protein evolution and polymorphism in Arabidopsis.

Genome Biol Evol. 2011;3:1210-9. doi: 10.1093/gbe/evr094. Epub 2011 Sep 16.

Computational analysis of RNA editing sites in plant mitochondrial genomes reveals similar information content and a sporadic distribution of editing sites.

Mol Biol Evol. 2007 Sep;24(9):1971-81. doi: 10.1093/molbev/msm125. Epub 2007 Jun 24.

Molecular Evidence for Functional Divergence and Decay of a Transcription Factor Derived from Whole-Genome Duplication in Arabidopsis thaliana.

Plant Physiol. 2015 Aug;168(4):1717-34. doi: 10.1104/pp.15.00689. Epub 2015 Jun 23.

Cited by

Natural variation and improved genome annotation of the emerging biofuel crop field pennycress (Thlaspi arvense).

G3 (Bethesda). 2022 May 30;12(6). doi: 10.1093/g3journal/jkac084.

Gene flow as a simple cause for an excess of high-frequency-derived alleles.

Evol Appl. 2020 Jun 2;13(9):2254-2263. doi: 10.1111/eva.12998. eCollection 2020 Oct.

Genomic variations and distinct evolutionary rate of rare alleles in Arabidopsis thaliana.

BMC Evol Biol. 2016 Jan 27;16:25. doi: 10.1186/s12862-016-0590-7.

Recombination is associated with the evolution of genome structure and worker behavior in honey bees.

Proc Natl Acad Sci U S A. 2012 Oct 30;109(44):18012-7. doi: 10.1073/pnas.1208094109. Epub 2012 Oct 15.

Population genomics and local adaptation in wild isolates of a model microbial eukaryote.

Proc Natl Acad Sci U S A. 2011 Feb 15;108(7):2831-6. doi: 10.1073/pnas.1014971108. Epub 2011 Jan 31.
