Aklilu Lemma Institute of Pathobiology, Addis Ababa University, Addis Ababa, Ethiopia.
College of Natural and Computational Science, Wallaga University, Nekemte, Ethiopia.
PLoS One. 2024 Jul 25;19(7):e0304060. doi: 10.1371/journal.pone.0304060. eCollection 2024.
Lineage 4 (L4) of Mycobacterium tuberculosis (MTB) is both globally prevalent and locally dominant, surpassing all other lineages, with lineage 2 (L2) second in prevalence. Despite this widespread occurrence, the factors driving the expansion of L4 and its sublineages remain poorly understood at both local and global levels. This study therefore aimed to conduct a pan-genome analysis and identify genomic signatures linked to the elevated prevalence of L4 sublineages among extrapulmonary TB (EPTB) patients in western Ethiopia.
An institution-based cross-sectional study was conducted among confirmed extrapulmonary tuberculosis (EPTB) patients from August 5, 2018, to December 30, 2019. A total of 75 MTB genomes classified as lineage 4 (L4) were used for pan-genome and genome-wide association study (GWAS) analyses. After a quality check, variants were identified using MTBseq, and genomes were assembled de novo using SPAdes. Gene prediction and annotation were performed using Prokka. The pan-genome was constructed using GET_HOMOLOGUES, and its functional analysis was carried out with the Bacterial Pan-Genome Analysis tool (BPGA). For the GWAS, Scoary was employed with Benjamini-Hochberg correction and a significance threshold of p-value ≤ 0.05.
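The Benjamini-Hochberg correction mentioned above can be illustrated with a minimal sketch. This is not Scoary's implementation; the function name `benjamini_hochberg` and the example p-values are assumptions made purely for illustration. It computes adjusted p-values by the standard step-up procedure, which can then be compared against the 0.05 threshold.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order.

    Hypothetical helper for illustration only; not taken from Scoary.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Made-up p-values for four hypothetical gene-trait tests.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [p <= 0.05 for p in adj]
```

Under this procedure a gene is reported only if its adjusted p-value, rather than its raw p-value, falls at or below the chosen threshold.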
The analysis revealed a total of 3,270 core genes, whose clusters of orthologous groups (COG) functions fell predominantly in the categories '[R] General function prediction only' and '[I] Lipid transport and metabolism'. Conversely, functions related to '[N] Cell motility' and '[Q] Secondary metabolites biosynthesis, transport, and catabolism' were primarily linked to unique and accessory genes. The pan-genome of MTB L4 was found to be open. Furthermore, the GWAS identified genomic signatures linked to the prevalence of sublineages L4.6.3 and L4.2.2.2.
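The distinction between core, accessory, and unique genes can be sketched from a gene presence/absence table. The sketch below assumes the strict definitions (core: present in every genome; unique: present in exactly one; accessory: everything in between). GET_HOMOLOGUES and BPGA may apply their own thresholds, and the function name, gene labels, and genome labels are hypothetical.

```python
def classify_genes(presence, n_genomes):
    """Split gene clusters into core, accessory, and unique sets.

    presence: dict mapping a gene cluster name to the set of genome
    identifiers in which it was found (hypothetical input format).
    """
    categories = {"core": [], "accessory": [], "unique": []}
    for gene, genomes in presence.items():
        count = len(genomes)
        if count == 0:
            continue  # cluster not observed in any genome; skip it
        if count == n_genomes:
            categories["core"].append(gene)
        elif count == 1:
            categories["unique"].append(gene)
        else:
            categories["accessory"].append(gene)
    return categories


# Toy example with three genomes and made-up cluster names.
toy = {
    "geneA": {"g1", "g2", "g3"},
    "geneB": {"g1", "g3"},
    "geneC": {"g2"},
}
result = classify_genes(toy, n_genomes=3)
```

A pan-genome is usually called open when the total number of gene clusters keeps growing as further genomes are added, which is what a steady supply of new unique and accessory genes implies.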
Apart from host and environmental factors, L4 sublineages employ distinct virulence factors for successful dissemination in western Ethiopia. Because the functions of these newly identified genes are not well understood, their roles should be validated experimentally, particularly in the successful transmission of specific L4 sublineages over others.