Otto Thomas D, Assefa Sammy A, Böhme Ulrike, Sanders Mandy J, Kwiatkowski Dominic, Berriman Matt, Newbold Chris
Parasite Genetics, Wellcome Trust Sanger Institute, Hinxton, UK.
Institute of Infection, Immunity & Inflammation, MVLS, University of Glasgow, Glasgow, UK.
Wellcome Open Res. 2019 Dec 3;4:193. doi: 10.12688/wellcomeopenres.15590.1. eCollection 2019.
The var gene family of the human malaria parasite Plasmodium falciparum encodes proteins that are crucial determinants of both pathogenesis and immune evasion and are highly polymorphic. Here we have assembled nearly complete var gene repertoires from 2398 field isolates and analysed a normalised set of 714 from across 12 countries. This represents the first large-scale attempt to catalogue the worldwide distribution of var gene sequences. We confirm the extreme polymorphism of this gene family but also demonstrate an unexpected level of sequence sharing both within and between continents. We show that this is likely due both to the remnants of selective sweeps and to a worrying degree of recent gene flow across continents, with implications for the spread of drug resistance. We also address the evolution of the var repertoire with respect to the ancestral genes within the Laverania and show that diversity generated by recombination is concentrated in a number of hotspots. An analysis of the subdomain structure indicates that some existing definitions may need to be revised. From the analysis of these data, we can now understand how the family has evolved and how its diversity is continuously being generated. Finally, we demonstrate that because the genes are distributed across the genome, sequence sharing between genotypes acts as a useful population genetic marker.