Department of Pathology, Cambridge University, Tennis Court Road, Cambridge, CB2 1QP, UK.
BMC Genomics. 2020 Mar 17;21(1):236. doi: 10.1186/s12864-020-6625-x.
The Plasmodium genus of malaria parasites encodes several families of antigen-encoding genes. These genes tend to be hyper-variable, highly recombinogenic and variantly expressed. The best-characterized family is the var genes, found exclusively in the Laverania subgenus of malaria parasites, which infect humans and great apes. Var genes encode major virulence factors involved in immune evasion and the maintenance of chronic infections. In the human parasite P. falciparum, var gene recombination and diversification appear to be promoted by G-quadruplex (G4) DNA motifs, which are strongly associated with var genes in this species. Here, we investigated how this association might have evolved across Plasmodium species, including both the Laverania and more distantly related species that lack var genes but encode other, more ancient variant gene families.
The association between var genes and G4-forming motifs was conserved across the Laverania, spanning ~1 million years of evolutionary time, with suggestive evidence that the association evolved within this subgenus. In rodent malaria species, G4-forming motifs were somewhat associated with pir genes, but this association was not conserved in the Laverania, nor did we find a strong association of these motifs with any gene family in a second outgroup of avian malaria parasites. We also compared the performance of two different G4 prediction algorithms on the extremely A/T-rich Plasmodium genomes, and compared their predictions with experimental data from G4-seq, a DNA sequencing method for identifying G4-forming motifs. We found a surprising lack of concordance between the two algorithms, and also between the algorithms and the G4-seq data.
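To make the idea of a G4 prediction algorithm concrete, the sketch below scans both strands of a sequence for a commonly used canonical motif definition: four runs of at least three guanines separated by loops of 1 to 7 nucleotides. This is an illustrative assumption only. The abstract does not name the two algorithms that were compared, and the pattern, the loop-length limits and the function names here are hypothetical choices rather than the authors' method.

```python
import re

# Assumed canonical motif: four G-runs (>= 3 G each) separated by
# loops of 1-7 nt. Both thresholds are illustrative, not from the paper.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_g4_motifs(seq):
    """Return sorted (start, end, strand) tuples for non-overlapping matches.

    Coordinates are 0-based, end-exclusive, and always given on the
    forward strand, including for hits found on the reverse strand.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), m.end(), "+") for m in G4_PATTERN.finditer(seq)]
    # Scan the reverse complement and map hits back to forward coordinates.
    for m in G4_PATTERN.finditer(reverse_complement(seq)):
        hits.append((n - m.end(), n - m.start(), "-"))
    return sorted(hits)
```

A fixed pattern like this is only one way to define a G4-forming motif; scoring-based predictors and experimental maps such as G4-seq can disagree with it, which is the kind of discordance the results describe.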
G4-forming motifs are strongly and uniquely associated with Plasmodium var genes, suggesting a particular role for G4s in the recombination and diversification of these genes. In addition, in the A/T-rich genomes of Plasmodium species, the choice of prediction algorithm may be particularly influential when studying G4s in these important protozoan pathogens.