Lapp Stacey A, Korir Cindy C, Galinski Mary R
Emory Vaccine Center, Yerkes National Primate Research Center, Emory University, Atlanta, Georgia, USA.
Malar J. 2009 Jul 31;8:181. doi: 10.1186/1475-2875-8-181.
The SICAvar gene family, expressed at the surface of infected erythrocytes, is critical for antigenic variation in Plasmodium knowlesi. When this family was discovered, a prototypic SICAvar gene was characterized and defined by a 10-exon structure. The predicted 205-kDa protein lacked a convincing signal peptide, but included a series of variable cysteine-rich modules, a transmembrane domain encoded by the penultimate exon, and a cytoplasmic domain encoded by the final, highly conserved exon. The 205 SICAvar gene and its family, with up to 108 possible members, were identified prior to the sequencing of the P. knowlesi genome. However, in the published P. knowlesi database this gene remains disjointed in five fragments. This study addresses a number of structural and functional questions that are critical for understanding SICAvar gene expression.
Database mining, bioinformatics, and traditional genomic and post-genomic experimental methods including proteomic technologies are used here to confirm the genomic context and expressed structure of the prototype 205 SICAvar gene.
This study reveals that the 205 SICAvar gene, reported previously to have a 10-exon expressed gene structure, has in fact 12 exons, with an unusually large and repeat-laden intron separating two newly defined upstream exons and the bona fide 5'UTR from the remainder of the gene sequence. The initial exon encodes a PEXEL motif, which may function to localize the SICA protein to the infected erythrocyte membrane. This newly defined start of the 205 SICAvar sequence is positioned on chromosome 5, over 340 kb upstream from the rest of the telomerically positioned SICAvar gene sequence in the published genome assembly. This study, however, verifies the continuity of these sequences as a 9.5 kb transcript and provides evidence that the 205 SICAvar gene is located centrally on chromosome 5.
The prototype 205 SICAvar gene has been redefined to have a 12-exon structure. These data are important because they 1) address questions raised in the P. knowlesi genome database regarding SICAvar gene fragments, numbers and structures, 2) show that this prototype gene encodes a PEXEL motif, 3) emphasize the need for further refinement of the P. knowlesi genome data, and 4) retrospectively, provide evidence for recombination within centrally located SICAvar sequences.