Departamento de Parasitologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, Brazil.
Department of Biology, University of York, York, Yorkshire, United Kingdom.
mBio. 2022 Dec 20;13(6):e0231922. doi: 10.1128/mbio.02319-22. Epub 2022 Oct 20.
Repetitive elements cause assembly fragmentation in complex eukaryotic genomes, limiting the study of their variability. The genome of Trypanosoma cruzi, the parasite that causes Chagas disease, has a high repetitive content, including multigene families. Although many T. cruzi multigene families encode surface proteins that play pivotal roles in host-parasite interactions, their variability is currently underestimated, as their high repetitive content results in collapsed gene variants. To estimate sequence variability and copy number variation of multigene families, we developed a read-based approach that is independent of gene-specific read mapping and assembly. This methodology was used to estimate the copy number and variability of MASP, TcMUC, and trans-sialidase (TS), the three largest T. cruzi multigene families, in 36 strains, including members of all six parasite discrete typing units (DTUs). We found that these three families present a specific pattern of variability and copy number among the distinct parasite DTUs. Inter-DTU hybrid strains presented a higher variability of these families, suggesting that maintaining a larger content of their members could be advantageous. In addition, in a chronic murine model and chronic Chagasic human patients, the immune response was focused on TS antigens, suggesting that targeting TS conserved sequences could be a potential avenue to improve diagnosis and vaccine design against Chagas disease. Finally, the proposed approach can be applied to study multicopy genes in any organism, opening new avenues to access sequence variability in complex genomes.
Sequences that have several copies in a genome, such as multicopy-gene families, mobile elements, and microsatellites, are among the most challenging genomic segments to study. They are frequently underestimated in genome assemblies, hampering the correct assessment of these important players in genome evolution and adaptation. Here, we developed a new methodology to estimate variability and copy numbers of repetitive genomic regions and employed it to characterize the T. cruzi multigene families MASP, TcMUC, and trans-sialidase (TS), which are important virulence factors in this parasite. We showed that multigene families vary in sequence and content among the parasite's lineages, whereas hybrid strains have a higher sequence variability that could be advantageous to the parasite's survival. By identifying conserved sequences within multigene families, we showed that the mammalian host immune response toward these multigene families is usually focused on the TS multigene family. These conserved, immunogenic TS peptides can be explored in future work as diagnostic targets or vaccine candidates for Chagas disease. Finally, this methodology can be easily applied to any organism of interest, which will aid in our understanding of complex genomic regions.