Smithsonian Environmental Research Center, Edgewater, Maryland, USA.
Mol Biol Evol. 2011 Apr;28(4):1469-80. doi: 10.1093/molbev/msq332. Epub 2010 Dec 13.
Dinoflagellates have unique nuclei and intriguing genome characteristics, with very high DNA content that makes complete genome sequencing difficult. In dinoflagellates, many genes are found in multicopy gene families, but the processes involved in the establishment and maintenance of these gene families are poorly understood. Understanding the dynamics of gene family evolution in dinoflagellates requires comparisons at different evolutionary scales. Studies of closely related species provide fine-scale information relative to species divergence, whereas comparisons of more distantly related species provide broad context. We selected actin as a highly expressed, conserved gene family previously studied in dinoflagellates. Of the 142 sequences determined in this study, 103 were from the two closely related species, Dinophysis acuminata and D. caudata, including full-length and partial cDNA sequences as well as partial genomic amplicons. For these two Dinophysis species, at least three types of sequences could be identified. Most copies (79%) were relatively similar, and in nucleotide trees these sequences formed two bushy clades corresponding to the two species. In comparisons within species, only eight to ten nucleotide differences were found between these copies. The two remaining types formed clades containing sequences from both species. One type included the most similar sequences in between-species comparisons, with as few as 12 nucleotide differences between species. The second type included the most divergent sequences in comparisons between and within species, with up to 93 nucleotide differences between sequences. In all the sequences, most variation occurred in synonymous sites or the 5' untranslated region (UTR), although there was still limited amino acid variation between most sequences. Several potential pseudogenes were found (approximately 10% of all sequences, depending on species) with incomplete open reading frames due to frameshifts or early stop codons. Overall, variation in the actin gene family fits best with the "birth and death" model of evolution, based on recent duplications, pseudogenes, and incomplete lineage sorting. Divergence between species was similar to variation within species, so actin may be too conserved to be useful for phylogenetic estimation of closely related species.
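The criterion used for potential pseudogenes (an incomplete open reading frame caused by a frameshift or an early stop codon) can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes a full-length coding sequence that begins at the start codon, the standard set of stop codons, and a hypothetical helper name `classify_orf`. Partial cDNA sequences and genomic amplicons, such as those in the study, would need alignment to a reference reading frame first.

```python
# Minimal sketch: flag coding sequences whose open reading frame looks incomplete.
# Assumptions (not from the source): the input is a full-length CDS that starts
# at ATG, and the standard stop codons apply.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_orf(cds: str) -> str:
    """Return a coarse label describing the reading frame of a coding sequence."""
    seq = cds.upper().replace("-", "")  # drop alignment gaps, if any

    if not seq.startswith("ATG"):
        return "no_start"

    # A length that is not a multiple of three suggests an indel (frameshift)
    # or a truncated sequence.
    if len(seq) % 3 != 0:
        return "frameshift_suspected"

    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]

    # Any stop codon before the final codon truncates the protein.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        return "premature_stop"

    if codons[-1] not in STOP_CODONS:
        return "no_terminal_stop"

    return "intact"


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not data from the study.
    examples = {
        "intact_example": "ATGGCTTAA",
        "early_stop_example": "ATGTAAGCTTAA",
        "frameshift_example": "ATGGCTTA",
    }
    for name, sequence in examples.items():
        print(name, classify_orf(sequence))
```

A sequence labelled `premature_stop` or `frameshift_suspected` is only a candidate pseudogene. Sequencing errors can produce the same signal, so such calls are best confirmed against independent reads.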