Sperling Linda, Dessen Philippe, Zagulski Marek, Pearlman Ron E, Migdalski Andrzey, Gromadka Robert, Froissard Marine, Keller Anne-Marie, Cohen Jean
Centre de Génétique Moléculaire, CNRS, 91198 Gif-sur-Yvette Cedex, France.
Eukaryot Cell. 2002 Jun;1(3):341-52. doi: 10.1128/EC.1.3.341-352.2002.
We report a random survey of 1 to 2% of the somatic genome of the free-living ciliate Paramecium tetraurelia by single-run sequencing of the ends of plasmid inserts. As in all ciliates, the germ line genome of Paramecium (100 to 200 Mb) is reproducibly rearranged at each sexual cycle to produce a somatic genome of expressed or potentially expressed genes, stripped of repeated sequences, transposons, and AT-rich unique sequence elements limited to the germ line. We found the somatic genome to be compact (>68% coding, estimated from the sequence of several complete library inserts) and to feature uniformly small introns (18 to 35 nucleotides). This facilitated gene discovery: 722 open reading frames (ORFs) were identified by similarity with known proteins, and 119 novel ORFs were tentatively identified by internal comparison of the data set. Using random combined protein data from the project, we determined the phylogenetic position of Paramecium relative to eukaryotes with sequenced genomes by the distance matrix neighbor-joining method. The unrooted tree obtained is very robust and in excellent agreement with accepted topology, providing strong support for the quality and consistency of the data set. Our study demonstrates that a random survey of the somatic genome of Paramecium is a good strategy for gene discovery in this organism.
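For readers unfamiliar with the distance matrix neighbor-joining method named above, the following is a minimal sketch of the standard algorithm in plain Python. It is not the software or the parameters used in the study, which the abstract does not specify. The function name `neighbor_joining`, the internal node labels (`node0`, `node1`, ...), and the five-taxon distance matrix are all illustrative assumptions. The distances are made-up toy values, not data from the paper.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Build an unrooted tree with the neighbor-joining algorithm.

    names: list of taxon labels
    dist:  dict mapping frozenset({a, b}) -> pairwise distance
    Returns a list of (node_a, node_b, branch_length) edges.
    """
    nodes = list(names)
    d = dict(dist)          # work on a copy so the caller's matrix is untouched
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence r_i: sum of distances from i to every other node
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i)
             for i in nodes}
        # Join the pair minimising Q(i, j) = (n - 2) * d(i, j) - r_i - r_j
        i, j = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]],
        )
        dij = d[frozenset((i, j))]
        # Branch lengths from i and j to the new internal node u
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        lj = dij - li
        u = f"node{next_id}"  # hypothetical label for the internal node
        next_id += 1
        edges += [(i, u, li), (j, u, lj)]
        # Distances from u to each remaining node
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij
                )
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    # Two nodes remain: connect them with their current distance
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


# Toy distances between five hypothetical taxa (illustrative only)
toy = {
    frozenset(p): v
    for p, v in {
        ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
        ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
        ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
    }.items()
}

edges = neighbor_joining(["a", "b", "c", "d", "e"], toy)
for left, right, length in edges:
    print(f"{left} -- {right}: {length:.2f}")
```

In a study like this one, the input matrix would hold distances estimated from aligned protein sequences. The resulting tree is unrooted, so the order in which edges are listed carries no meaning about ancestry.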