Lawton Jonathan G, Zhou Albert E, Stucke Emily M, Takala-Harrison Shannon, Silva Joana C, Travassos Mark A
Center for Vaccine Development and Global Health, University of Maryland School of Medicine, 685 West Baltimore Street Room 480, Baltimore, MD 21201, USA.
Institute for Genome Sciences, University of Maryland School of Medicine, 670 West Baltimore Street 3rd Floor, Baltimore, MD 21201, USA; Global Health and Tropical Medicine (GHTM), Instituto de Higiene e Medicina Tropical (IHMT), Universidade NOVA de Lisboa (NOVA), 100 Rua da Junqueira, Lisbon 1349-008, Portugal.
Infect Genet Evol. 2025 Apr;129:105725. doi: 10.1016/j.meegid.2025.105725. Epub 2025 Feb 5.
The repetitive interspersed family (rif) and subtelomeric variable open reading frames (stevor) are highly diverse multi-gene families in the malaria parasite Plasmodium falciparum. Embedded on the surface of infected erythrocytes, RIFIN and STEVOR proteins are involved in cytoadherence and immune evasion, but the extent of family-wide sequence diversity across strains has yet to be comprehensively investigated in light of improved resolution of the subtelomeric genome sequences. Using a k-mer frequency approach, we analyzed long-read genomic sequence data from 18 geographically diverse P. falciparum genome assemblies, including lab strains and clinical isolates. We hypothesized that k-mer sequence comparison can identify existing RIFIN and STEVOR subgroups, identify novel subgroups, and generate more robust and reliable estimates of family-wide sequence diversity. Full-length RIFIN and STEVOR proteins shared on average 49.5% and 61.1% amino acid k-mer similarity, respectively, which fell to 25.1% and 20% in the hypervariable regions alone. Despite this diversity, we identified 11 RIFINs and five STEVORs that were conserved across strains above expected thresholds. A subset of these strain-transcendent genes was similar and syntenic to genes in related Plasmodium species, suggesting an ancient origin. Additionally, in silico structural predictions from AlphaFold showed that three-dimensional structures of RIFIN receptor-binding regions were more conserved than their sequences suggested. Evolutionarily constrained RIFINs and STEVORs may have critical functions in parasite survival or pathogenesis. This study provides a framework for investigating diversity in highly variable multi-gene families and highlights the potential of strain-transcendent RIFIN and STEVOR proteins as vaccine candidates.
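As a rough illustration of what an amino acid k-mer similarity comparison can look like, the following minimal Python sketch scores two protein sequences by the overlap of their k-mer counts and averages that score over all pairs in a set. The abstract does not state the k-mer length or the exact similarity measure used in the study, so the choice of k = 3, the weighted Jaccard index, the function names, and the toy sequences are all assumptions made for illustration, not the authors' method.

```python
from collections import Counter
from itertools import combinations


def kmer_counts(seq, k):
    """Count every overlapping k-mer in an amino acid sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_similarity(a, b, k=3):
    """Weighted Jaccard similarity of two sequences' k-mer counts.

    Assumption: k=3 and the weighted Jaccard index are illustrative
    choices; the abstract does not specify either.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    shared = sum((ca & cb).values())  # per-k-mer minimum counts
    total = sum((ca | cb).values())   # per-k-mer maximum counts
    return shared / total if total else 0.0


def mean_pairwise_similarity(seqs, k=3):
    """Average k-mer similarity over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 0.0
    return sum(kmer_similarity(a, b, k) for a, b in pairs) / len(pairs)


# Hypothetical toy peptides, not real RIFIN or STEVOR sequences.
toy = ["MKVHY", "MKVHW", "MKVHY"]
print(kmer_similarity(toy[0], toy[1]))
print(mean_pairwise_similarity(toy))
```

Under these assumptions, restricting the input to a subregion of each protein (for example, only the hypervariable region) would give a region-specific similarity in the same way that the full-length sequences give a family-wide one.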