Bachvaroff Tsvetan R, Place Allen R
Center of Marine Biotechnology, University of Maryland Biotechnology Institute, Baltimore, Maryland, United States of America.
PLoS One. 2008 Aug 13;3(8):e2929. doi: 10.1371/journal.pone.0002929.
Dinoflagellate genomes present unique challenges, including large size, modified DNA bases, lack of nucleosomes, and condensed chromosomes. EST sequencing has shown that many genes occur as multiple, slightly different variants, implying that many copies are present in the genome. As a preliminary survey of the genome, our goal was to obtain genomic sequences for 47 genes from the dinoflagellate Amphidinium carterae. A PCR approach was used to avoid the problems associated with large-insert libraries. One primer set was oriented inward to amplify the genomic complement of the cDNA, and a second primer set was oriented outward to amplify the region between tandem repeats of the same gene. Each gene was also tested for a spliced leader using cDNA as template. Almost all (14/15) of the highly expressed genes (i.e., those with high representation in the cDNA pool) were found in tandem arrays with short intergenic spacers, and most were trans-spliced. Only two moderately expressed genes were found in tandem arrays. A polyadenylation signal containing the sequence AAAAG/C was found in genomic copies at the exact polyadenylation site and was conserved between species. Four genes had a high intron density (>5 introns), while most either lacked introns or had only one to three. Actin was selected for deeper sequencing of both genomic and cDNA copies. Two clusters of actin copies were found, distinguished from each other by many non-coding features such as intron size and sequence. One intron-rich gene was selected for genomic walking using inverse PCR and was not found to be in a tandem repeat. This first glimpse of the dinoflagellate genome indicates two general categories of genes: a highly expressed tandem-repeat class and an intron-rich, less highly expressed class. This combination of features appears to be unique among eukaryotes.
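The logic of the two primer sets can be illustrated with a minimal in-silico sketch. It is not the authors' protocol: the sequences (`head`, `body`, `tail`, `spacer`) are invented toy data, the `amplify` helper is hypothetical, and primers are assumed to anneal only at exact matches on a single linear template. Inward primers should give a product wherever the gene is present, whereas outward primers should give a product only when a second copy lies downstream, as in a tandem array.

```python
def revcomp(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def amplify(template, fwd, rev):
    """Toy PCR: return the first product on the top strand, or None.

    fwd is written as it appears on the top strand; rev is written 5'->3'
    as a real reverse primer would be, so its binding site on the top
    strand is revcomp(rev). Exact matching only (an assumption).
    """
    site = revcomp(rev)
    start = template.find(fwd)
    if start == -1:
        return None
    end = template.find(site, start + len(fwd))
    if end == -1:
        return None
    return template[start:end + len(site)]


# Invented toy sequences, not taken from Amphidinium carterae.
head = "ATGGCTCGTACC"    # 5' end of the toy gene
body = "AAGCAGACTGCC"
tail = "CGTAAGTCCTAA"    # 3' end of the toy gene
spacer = "GTGTGTTTTCAG"  # toy intergenic spacer
gene = head + body + tail

tandem = (gene + spacer) * 3          # three copies, head to tail
single = "CCCCC" + gene + "GGGGG"     # one isolated copy

# Inward set: amplifies the gene itself.
fwd_in, rev_in = head, revcomp(tail)
# Outward set: reads out of the 3' end and into the next copy's 5' end.
fwd_out, rev_out = tail, revcomp(head)

inward_product = amplify(tandem, fwd_in, rev_in)
outward_tandem = amplify(tandem, fwd_out, rev_out)
outward_single = amplify(single, fwd_out, rev_out)

print("inward product length:", len(inward_product))
print("outward product on tandem array:", outward_tandem)
print("outward product on single copy:", outward_single)
```

In this toy model the outward product spans the end of one copy, the spacer, and the start of the next copy, so its length minus the two primer-bearing gene segments gives the spacer length.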