Identification of shared single copy nuclear genes in Arabidopsis, Populus, Vitis and Oryza and their phylogenetic utility across various taxonomic levels.

Affiliation

Department of Biology and the Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA 16802, USA.

Publication information

BMC Evol Biol. 2010 Feb 24;10:61. doi: 10.1186/1471-2148-10-61.

DOI:10.1186/1471-2148-10-61

PMID:20181251

Original article link: https://pmc.ncbi.nlm.nih.gov/articles/PMC2848037/

Abstract

BACKGROUND

Although the overwhelming majority of genes found in angiosperms are members of gene families, and both gene and genome duplication are pervasive forces in plant genomes, some genes are sufficiently distinct from all other genes in a genome that they can be operationally defined as 'single copy'. Using the gene clustering algorithm MCL-tribe, we have identified a set of 959 genes that are shared as single copy genes in the genomes of Arabidopsis thaliana, Populus trichocarpa, Vitis vinifera and Oryza sativa. To characterize these genes, we have performed a number of analyses examining GO annotations, coding sequence length, number of exons, number of domains, and presence in distant lineages such as Selaginella and Physcomitrella, together with phylogenetic analyses to estimate copy number in other seed plants and to demonstrate their phylogenetic utility. We then provide examples of how these genes may be used in phylogenetic analyses to reconstruct organismal history, both by using existing coverage in EST databases for seed plants and by de novo amplification via RT-PCR in the family Brassicaceae.
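The operational definition of 'shared single copy' can be illustrated with a minimal Python sketch. It assumes the clustering step has already produced clusters of (species, gene) pairs; the function name `shared_single_copy`, the cluster labels, and the gene identifiers are hypothetical and are not taken from the study, and the filter shown is only one plausible reading of the criterion, not the authors' pipeline.

```python
from collections import Counter

# The four reference genomes named in the abstract.
SPECIES = ("Arabidopsis", "Populus", "Vitis", "Oryza")


def shared_single_copy(clusters, species=SPECIES):
    """Return ids of clusters holding exactly one gene from each species.

    `clusters` maps a cluster id to a list of (species, gene_id) pairs.
    This is an illustrative filter, not the published method.
    """
    kept = []
    for cluster_id, members in clusters.items():
        counts = Counter(sp for sp, _gene in members)
        one_per_species = all(counts[sp] == 1 for sp in species)
        # Reject clusters that also contain genes from any other source.
        if one_per_species and len(members) == len(species):
            kept.append(cluster_id)
    return kept


# Toy input with made-up identifiers.
clusters = {
    "tribe_1": [("Arabidopsis", "a1"), ("Populus", "p1"),
                ("Vitis", "v1"), ("Oryza", "o1")],
    "tribe_2": [("Arabidopsis", "a2"), ("Arabidopsis", "a3"),
                ("Populus", "p2"), ("Vitis", "v2"), ("Oryza", "o2")],
    "tribe_3": [("Arabidopsis", "a4"), ("Populus", "p3"), ("Vitis", "v3")],
}

print(shared_single_copy(clusters))  # ['tribe_1']
```

In this toy input, `tribe_2` is rejected because Arabidopsis contributes two genes, and `tribe_3` because Oryza is absent.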

RESULTS

There are 959 single copy nuclear genes shared in Arabidopsis, Populus, Vitis and Oryza ("APVO SSC genes"). The majority of these genes are also present in the Selaginella and Physcomitrella genomes. Public EST sets for 197 species suggest that most of these genes are present across a diverse collection of seed plants, and appear to exist as single or very low copy genes, though exceptions are seen in recently polyploid taxa and in lineages where there is significant evidence for a shared large-scale duplication event. Genes encoding proteins localized in organelles are more commonly single copy than expected by chance, but the evolutionary forces responsible for this bias are unknown.

Regardless of the evolutionary mechanisms responsible for the large number of shared single copy genes in diverse flowering plant lineages, these genes are valuable for phylogenetic and comparative analyses. Eighteen of the APVO SSC genes were amplified in the Brassicaceae using RT-PCR and directly sequenced. Alignments of these sequences provide improved resolution of Brassicaceae phylogeny compared to recent studies using plastid and ITS sequences. An analysis of sequences from 13 APVO SSC genes from 69 species of seed plants, derived mainly from public EST databases, yielded a phylogeny that was largely congruent with prior hypotheses based on multiple plastid sequences. Whereas single gene phylogenies that rely on EST sequences have limited bootstrap support as a result of limited sequence information, concatenated alignments result in phylogenetic trees with strong bootstrap support for already established relationships. Overall, these single copy nuclear genes are promising markers for phylogenetics, and contain a greater proportion of phylogenetically informative sites than commonly used protein-coding sequences from the plastid or mitochondrial genomes.
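The idea of concatenating per-gene alignments can be sketched as follows. This is a minimal illustration, assuming each gene alignment is a dictionary from taxon name to an aligned sequence of equal length, and that taxa missing from a gene (as often happens with EST-derived data) are padded with gap characters. The function name `concatenate_alignments` and the toy sequences are hypothetical; the abstract does not specify how the authors built their concatenated matrices.

```python
def concatenate_alignments(alignments, missing="-"):
    """Join per-gene alignments into one supermatrix, padding absent taxa.

    `alignments` is a list of dicts mapping taxon name -> aligned sequence.
    Illustrative only; not the procedure used in the study.
    """
    # Collect every taxon seen in any gene alignment.
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    supermatrix = {taxon: "" for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one alignment must have equal length")
        length = lengths.pop()
        for taxon in taxa:
            # A taxon lacking this gene receives a run of missing-data symbols.
            supermatrix[taxon] += aln.get(taxon, missing * length)
    return supermatrix


# Toy alignments with made-up sequences.
gene_a = {"Arabidopsis": "ATGC", "Brassica": "ATGA"}
gene_b = {"Arabidopsis": "GG", "Brassica": "GC", "Carica": "GT"}

matrix = concatenate_alignments([gene_a, gene_b])
for taxon, seq in matrix.items():
    print(taxon, seq)
```

Padding keeps every row the same length, so a taxon with sparse EST coverage still contributes the sites it does have.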

CONCLUSIONS

Putatively orthologous, shared single copy nuclear genes provide a vast source of new evidence for plant phylogenetics, genome mapping, and other applications, as well as a substantial class of genes for which functional characterization is needed. Preliminary evidence indicates that many of the shared single copy nuclear genes identified in this study may be well suited as markers for addressing phylogenetic hypotheses at a variety of taxonomic levels.
