Li Li, Brunk Brian P, Kissinger Jessica C, Pape Deana, Tang Keliang, Cole Robert H, Martin John, Wylie Todd, Dante Mike, Fogarty Steven J, Howe Daniel K, Liberator Paul, Diaz Carmen, Anderson Jennifer, White Michael, Jerome Maria E, Johnson Emily A, Radke Jay A, Stoeckert Christian J, Waterston Robert H, Clifton Sandra W, Roos David S, Sibley L David
Department of Biology, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA.
Genome Res. 2003 Mar;13(3):443-54. doi: 10.1101/gr.693203.
Large-scale EST sequencing projects for several important parasites within the phylum Apicomplexa were undertaken for the purpose of gene discovery. Included were parasites of medical importance (Plasmodium falciparum, Toxoplasma gondii) and others of veterinary importance (Eimeria tenella, Sarcocystis neurona, and Neospora caninum). A total of 55,192 ESTs, deposited into dbEST/GenBank, were included in the analyses. The resulting sequences have been clustered into nonredundant gene assemblies and deposited into a relational database that supports a variety of sequence and text searches. The gene assemblies in this database were compared against the public protein databases by BLAST similarity searches to identify putative genes. Of these new entries, approximately 15%-20% represent putative homologs at a conservative cutoff of p < 10^-9, thus identifying many conserved genes that are likely to share common functions with genes in other well-studied organisms. Gene assemblies were also used to identify strain polymorphisms, examine stage-specific expression, and identify gene families. An interesting class of genes was identified that is confined to members of this phylum and not shared by plants, animals, or fungi. These genes likely mediate the novel biological features of members of the Apicomplexa and hence offer great potential for biological investigation and as possible therapeutic targets.
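The homolog fraction quoted in the abstract depends on applying a fixed significance cutoff to similarity hits. The following is a minimal sketch of that kind of filter, not the authors' pipeline: the function names, the `(assembly_id, subject_id, p_value)` tuple layout, and the sample data are all hypothetical, and only the cutoff of 10^-9 comes from the abstract.

```python
from typing import Iterable, Set, Tuple

# Cutoff quoted in the abstract (p < 10^-9); everything else here is illustrative.
P_CUTOFF = 1e-9

# Hypothetical hit record: (assembly_id, subject_id, p_value)
Hit = Tuple[str, str, float]


def assemblies_with_homologs(hits: Iterable[Hit], cutoff: float = P_CUTOFF) -> Set[str]:
    """Return the IDs of assemblies that have at least one hit strictly below the cutoff."""
    return {assembly_id for assembly_id, _subject_id, p_value in hits if p_value < cutoff}


def homolog_fraction(assembly_ids: Iterable[str], hits: Iterable[Hit],
                     cutoff: float = P_CUTOFF) -> float:
    """Fraction of the given assemblies with at least one hit passing the cutoff."""
    ids = set(assembly_ids)
    if not ids:
        return 0.0
    matched = assemblies_with_homologs(hits, cutoff) & ids
    return len(matched) / len(ids)


if __name__ == "__main__":
    # Made-up example data, for illustration only.
    assemblies = ["A1", "A2", "A3", "A4", "A5"]
    example_hits = [
        ("A1", "protX", 1e-30),  # passes the cutoff
        ("A1", "protY", 1e-12),  # second hit for the same assembly, counted once
        ("A2", "protZ", 1e-5),   # too weak
        ("A3", "protW", 1e-9),   # equal to the cutoff, so not strictly below it
    ]
    print(homolog_fraction(assemblies, example_hits))
```

Counting each assembly once, regardless of how many hits it has, keeps the result a fraction of assemblies rather than a fraction of hits.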