Keely Scott P, Stringer James R
Department of Molecular Genetics, Biochemistry and Microbiology, University of Cincinnati College of Medicine, Cincinnati, Ohio, 45220, USA.
BMC Genomics. 2009 Aug 7;10:367. doi: 10.1186/1471-2164-10-367.
The relationship between the parasitic fungus Pneumocystis carinii and its host, the laboratory rat, presumably involves features that allow the fungus to circumvent attacks by the immune system. It is hypothesized that the major surface glycoprotein (MSG) gene family endows Pneumocystis with the capacity to vary its surface. This gene family comprises approximately 80 genes, each approximately 3 kb long. Expression of the MSG gene family is regulated by a cis-dependent mechanism that involves a unique telomeric site in the genome called the expression site. Only the MSG gene adjacent to the expression site is represented by messenger RNA. Several P. carinii MSG genes have been sequenced, showing that genes in the family can encode distinct isoforms of MSG. The vast majority of family members have not been characterized at the sequence level.
The first 300 basepairs of MSG genes were analyzed. Analysis of 581 MSG sequence reads from P. carinii genomic DNA yielded 281 different sequences. However, many of the sequence reads differed from others at only one site, a degree of variation consistent with that expected from sequencing error. Accounting for error reduced the number of truly distinct sequences observed to 158, roughly twice the number expected if the gene family contains 80 members. The size of the gene family was verified by PCR. The excess of distinct sequences appeared to be due to allelic variation. Discounting alleles, 73 different MSG genes were observed. The 73 genes differed by 19% on average. Variable regions were rich in nucleotide differences that changed the encoded protein. The genes shared three regions in which at least 16 consecutive basepairs were invariant. There were numerous cases where two different genes were identical within a region that was variable among family members as a whole, suggesting recombination among family members.
A set of sequences that represents most if not all of the members of the P. carinii MSG gene family was obtained. The protein-changing nature of the variation among these sequences suggests that the family has been shaped by selection for protein variation, which is consistent with the hypothesis that the MSG gene family functions to enhance phenotypic variation among the members of a population of P. carinii.
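The error-correction step described in the results (treating reads that differ from another read at only one site as probable errors) can be illustrated with a minimal Python sketch. This is not the authors' procedure, which the abstract does not specify. It assumes that reads are already aligned and trimmed to equal length, and that a single-site variant is merged greedily into the most abundant sequence within one mismatch. The function names `collapse_single_site_variants` and `mean_percent_difference` are hypothetical, and the toy reads are invented for illustration only.

```python
from __future__ import annotations

from collections import Counter
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count mismatched positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def collapse_single_site_variants(reads: list[str]) -> dict[str, int]:
    """Merge each sequence into a more abundant one that differs at <= 1 site.

    Assumption (not from the paper): a greedy pass from the most abundant
    sequence to the least abundant one is an adequate stand-in for the
    error model.
    """
    counts = Counter(reads)
    # Most abundant first; ties are broken alphabetically for determinism.
    ordered = sorted(counts, key=lambda s: (-counts[s], s))
    representatives: dict[str, int] = {}
    for seq in ordered:
        for rep in representatives:
            if hamming(seq, rep) <= 1:
                # Treat seq as an error-containing copy of rep.
                representatives[rep] += counts[seq]
                break
        else:
            # No close representative found: keep seq as truly distinct.
            representatives[seq] = counts[seq]
    return representatives


def mean_percent_difference(seqs: list[str]) -> float:
    """Average pairwise percent difference across all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return 100.0 * sum(hamming(a, b) / len(a) for a, b in pairs) / len(pairs)


# Toy data, invented for illustration only.
toy_reads = ["ACGTACGT"] * 3 + ["ACGTACGA"] + ["TTGTACCC"] * 2
distinct = collapse_single_site_variants(toy_reads)
divergence = mean_percent_difference(list(distinct))
```

With the toy reads, the single read `ACGTACGA` is folded into `ACGTACGT`, leaving two distinct sequences. The same pairwise calculation, applied to real aligned gene sequences, would give a figure comparable in kind to the 19% average difference reported above.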