Eisen Jonathan A, Coyne Robert S, Wu Martin, Wu Dongying, Thiagarajan Mathangi, Wortman Jennifer R, Badger Jonathan H, Ren Qinghu, Amedeo Paolo, Jones Kristie M, Tallon Luke J, Delcher Arthur L, Salzberg Steven L, Silva Joana C, Haas Brian J, Majoros William H, Farzad Maryam, Carlton Jane M, Smith Roger K, Garg Jyoti, Pearlman Ronald E, Karrer Kathleen M, Sun Lei, Manning Gerard, Elde Nels C, Turkewitz Aaron P, Asai David J, Wilkes David E, Wang Yufeng, Cai Hong, Collins Kathleen, Stewart B Andrew, Lee Suzanne R, Wilamowska Katarzyna, Weinberg Zasha, Ruzzo Walter L, Wloga Dorota, Gaertig Jacek, Frankel Joseph, Tsao Che-Chia, Gorovsky Martin A, Keeling Patrick J, Waller Ross F, Patron Nicola J, Cherry J Michael, Stover Nicholas A, Krieger Cynthia J, del Toro Christina, Ryder Hilary F, Williamson Sondra C, Barbeau Rebecca A, Hamilton Eileen P, Orias Eduardo
The Institute for Genomic Research, Rockville, Maryland, United States of America.
PLoS Biol. 2006 Sep;4(9):e286. doi: 10.1371/journal.pbio.0040286.
The ciliate Tetrahymena thermophila is a model organism for molecular and cellular biology. Like other ciliates, this species has separate germline and soma functions that are embodied by distinct nuclei within a single cell. The germline-like micronucleus (MIC) has its genome held in reserve for sexual reproduction. The soma-like macronucleus (MAC), which possesses a genome processed from that of the MIC, is the center of gene expression and does not directly contribute DNA to sexual progeny. We report here the shotgun sequencing, assembly, and analysis of the MAC genome of T. thermophila, which is approximately 104 Mb in length and composed of approximately 225 chromosomes. Overall, the gene set is robust, with more than 27,000 predicted protein-coding genes, 15,000 of which have strong matches to genes in other organisms. The functional diversity encoded by these genes is substantial and reflects the complexity of processes required for a free-living, predatory, single-celled organism. This is highlighted by the abundance of lineage-specific duplications of genes with predicted roles in sensing and responding to environmental conditions (e.g., kinases), using diverse resources (e.g., proteases and transporters), and generating structural complexity (e.g., kinesins and dyneins). In contrast to the other lineages of alveolates (apicomplexans and dinoflagellates), no compelling evidence could be found for plastid-derived genes in the genome. UGA, the only T. thermophila stop codon, is used in some genes to encode selenocysteine, thus making this organism the first known with the potential to translate all 64 codons in nuclear genes into amino acids. We present genomic evidence supporting the hypothesis that the excision of DNA from the MIC to generate the MAC specifically targets foreign DNA as a form of genome self-defense. 
The combination of the genome sequence, the functional diversity encoded therein, and the presence of some pathways missing from other model organisms makes T. thermophila an ideal model for functional genomic studies to address biological, biomedical, and biotechnological questions of fundamental importance.
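The codon-usage claim in the abstract can be checked with simple counting. The sketch below is an illustration, not the authors' analysis. It assumes the ciliate nuclear genetic code (NCBI translation table 6), in which UAA and UAG are read as glutamine. That reassignment is background knowledge and is not stated in the abstract. Codons are written in the DNA alphabet (T in place of U), and the names `build_table`, `CILIATE`, and `coding_codons` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in TCAG order.
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_table(aas):
    """Map each of the 64 codons (TTT, TTC, ..., GGG) to a one-letter symbol."""
    codons = ["".join(c) for c in product(BASES, repeat=3)]
    return dict(zip(codons, aas))


STANDARD = build_table(STANDARD_AAS)

# Assumption (ciliate nuclear code): TAA and TAG encode glutamine (Q),
# leaving TGA as the only stop codon, as the abstract states.
CILIATE = {**STANDARD, "TAA": "Q", "TAG": "Q"}


def coding_codons(table, selenocysteine=False):
    """Return the codons that can specify an amino acid under `table`.

    With selenocysteine=True, TGA is also counted, reflecting its
    context-dependent use to encode selenocysteine in some genes.
    """
    coding = {codon for codon, aa in table.items() if aa != "*"}
    if selenocysteine:
        coding.add("TGA")
    return coding


if __name__ == "__main__":
    print(len(coding_codons(STANDARD)))                     # standard code
    print(len(coding_codons(CILIATE)))                      # ciliate code
    print(len(coding_codons(CILIATE, selenocysteine=True)))  # plus Sec at TGA
```

Under these assumptions, the standard code leaves three codons as stops, the ciliate code leaves only TGA, and selenocysteine recoding of TGA in some genes accounts for the last codon, which is the sense in which all 64 codons can be translated.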