Filliol Ingrid, Motiwala Alifiya S, Cavatore Magali, Qi Weihong, Hazbón Manzour Hernando, Bobadilla del Valle Miriam, Fyfe Janet, García-García Lourdes, Rastogi Nalin, Sola Christophe, Zozio Thierry, Guerrero Marta Inírida, León Clara Inés, Crabtree Jonathan, Angiuoli Sam, Eisenach Kathleen D, Durmaz Riza, Joloba Moses L, Rendón Adrian, Sifuentes-Osornio José, Ponce de León Alfredo, Cave M Donald, Fleischmann Robert, Whittam Thomas S, Alland David
Division of Infectious Disease, University of Medicine and Dentistry of New Jersey, 185 South Orange Ave., MSB A920C, Newark, NJ 07103.
J Bacteriol. 2006 Jan;188(2):759-72. doi: 10.1128/JB.188.2.759-772.2006.
We analyzed a global collection of Mycobacterium tuberculosis strains using 212 single nucleotide polymorphism (SNP) markers. SNP nucleotide diversity was high (average across all SNPs, 0.19), and 96% of the SNP locus pairs were in complete linkage disequilibrium. Cluster analyses identified six deeply branching, phylogenetically distinct SNP cluster groups (SCGs) and five subgroups. The SCGs were strongly associated with the geographical origin of the M. tuberculosis samples and the birthplace of the human hosts. The most ancestral cluster (SCG-1) predominated in patients from the Indian subcontinent, while SCG-1 and another ancestral cluster (SCG-2) predominated in patients from East Asia, suggesting that M. tuberculosis first arose in the Indian subcontinent and spread worldwide through East Asia. Restricted SCG diversity and the prevalence of less ancestral SCGs in indigenous populations in Uganda and Mexico suggested a more recent introduction of M. tuberculosis into these regions. The East African Indian and Beijing spoligotypes were concordant with SCG-1 and SCG-2, respectively; X and Central Asian spoligotypes were also associated with one SCG or subgroup combination. Other clades had less consistent associations with SCGs. Mycobacterial interspersed repetitive unit (MIRU) analysis provided less robust phylogenetic information, and only 6 of the 12 MIRU microsatellite loci were highly differentiated between SCGs as measured by G_ST. Finally, an algorithm was devised to identify two minimal sets of either 45 or 6 SNPs that could be used in future investigations to enable global collaborations for studies on evolution, strain differentiation, and biological differences of M. tuberculosis.
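The abstract reports per-locus diversity and uses Nei's G_ST statistic (the differentiation measure named in the abstract) to compare MIRU loci between SCGs, but it does not state which estimators were used. As an illustration only, the textbook definitions can be sketched as follows: gene diversity H = 1 - sum(p_i^2) for one locus, and G_ST = (H_T - H_S) / H_T. The function names, the size-weighted averaging of within-group diversity, and the toy allele calls are assumptions made for this sketch, not details taken from the study.

```python
from collections import Counter


def gene_diversity(alleles):
    """Nei's gene diversity H = 1 - sum(p_i^2) for a single locus.

    `alleles` is a list of allele calls, one per isolate
    (for example SNP bases or MIRU repeat counts).
    """
    n = len(alleles)
    if n == 0:
        raise ValueError("need at least one allele call")
    counts = Counter(alleles)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def g_st(groups):
    """G_ST = (H_T - H_S) / H_T for a single locus.

    `groups` maps a group label to the allele calls of its isolates.
    H_T is the diversity of the pooled sample; H_S is the mean
    within-group diversity, weighted by group size (an assumption
    of this sketch; other weightings are in use).
    """
    pooled = [a for alleles in groups.values() for a in alleles]
    h_t = gene_diversity(pooled)
    if h_t == 0:
        # A monomorphic locus carries no differentiation signal.
        return 0.0
    n = len(pooled)
    h_s = sum(len(a) / n * gene_diversity(a) for a in groups.values())
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    # Hypothetical toy calls: the two groups are fixed for different alleles.
    fixed = {"group_a": ["A", "A"], "group_b": ["G", "G"]}
    # Hypothetical toy calls: both groups share the same allele frequencies.
    mixed = {"group_a": ["A", "G"], "group_b": ["A", "G"]}
    print(g_st(fixed))
    print(g_st(mixed))
```

Under these definitions, a locus at which each group is fixed for a different allele yields a G_ST of 1, and a locus with identical allele frequencies in every group yields 0, which is the sense in which a locus can be called highly differentiated between groups.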