Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China.
Protein Pept Lett. 2020;27(8):711-717. doi: 10.2174/0929866527666200313113157.
Tuberculosis (TB), caused by Mycobacterium tuberculosis (Mtb), is one of the oldest known and most dangerous diseases. Although the spread of TB was controlled in the early 20th century using antibiotics and vaccines, TB has again become a threat because of increased drug resistance. Effective treatment regimens are still lacking for patients infected with multidrug-resistant Mtb (MDR-Mtb) or extensively drug-resistant Mtb (XDR-Mtb). In the past decades, many research groups have used genome-wide association studies (GWAS) of sequence data to explore the drug resistance profiles of Mtb, identifying mutations significantly linked with drug resistance and attempting to explain the resistance mechanisms. However, these studies focused mainly on a few significant mutations in drug targets (e.g. rpoB, katG), so some genes potentially associated with drug resistance may be overlooked by GWAS analysis.
In this article, we aim to detect potential drug resistance genes of Mtb using a heat diffusion model.
The sequencing data comprised 127 Mtb samples, including 34 ethambutol-, 65 isoniazid-, 53 rifampicin- and 45 streptomycin-resistant strains. The raw sequence data were preprocessed using Trimmomatic and aligned to the Mtb H37Rv reference genome using Bowtie2. From the resulting alignments, SAMtools and VarScan were used to filter sequences and call SNPs. GWAS was performed with the PLINK package to obtain significant SNPs, which were mapped to genes. The gene-level P-values calculated by GWAS were transformed into a heat vector. The heat vector and the Mtb protein-protein interactions (PPI) derived from the STRING database were input into the heat diffusion model, and significant subnetworks were obtained with HotNet2. Finally, the most significant (P < 0.05) subnetworks associated with the different phenotypes were obtained. To assess the change in binding energy between the drug and its target before and after mutation, molecular dynamics simulation was performed using the AMBER software.
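The heat diffusion step can be illustrated with a minimal sketch. The abstract does not state how P-values were converted to heat or which diffusion parameters were used, so the -log10(P) transform, the restart parameter `beta = 0.4`, the toy edge list and the P-values below are all assumptions made for illustration. The recursion is the insulated diffusion form commonly associated with HotNet2, whose fixed point is beta (I - (1 - beta) W)^-1 h. It is not the HotNet2 implementation itself, which additionally builds a directed influence graph and extracts subnetworks from it.

```python
import math


def heat_from_pvalues(pvalues):
    """Map gene-level P-values to heat scores via -log10(P) (an assumed transform)."""
    return {gene: -math.log10(p) for gene, p in pvalues.items()}


def diffuse(edges, heat, beta=0.4, iterations=200):
    """Insulated heat diffusion on an undirected interaction graph.

    Iterates f <- (1 - beta) * W f + beta * h, where W[i][j] = 1 / degree(j)
    when genes i and j interact. Each gene keeps a fraction beta of its own
    heat and passes the rest evenly to its neighbors at every step.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    genes = sorted(neighbors)
    h = {g: heat.get(g, 0.0) for g in genes}  # genes without a P-value get zero heat
    f = dict(h)
    for _ in range(iterations):
        f = {
            i: beta * h[i]
            + (1 - beta) * sum(f[j] / len(neighbors[j]) for j in neighbors[i])
            for i in genes
        }
    return f


# Hypothetical toy network and P-values, not data from the study.
toy_edges = [("rpoB", "rpoC"), ("rpoC", "topA"), ("topA", "pyrG")]
toy_pvalues = {"rpoB": 1e-8, "rpoC": 1e-3, "topA": 0.5, "pyrG": 0.5}

toy_heat = heat_from_pvalues(toy_pvalues)
diffused = diffuse(toy_edges, toy_heat)
for gene in sorted(diffused):
    print(gene, round(toy_heat[gene], 3), round(diffused[gene], 3))
```

In this toy setting, a gene with a weak association of its own can still end up warm if it sits next to strongly associated genes. This is the intuition behind using network diffusion to recover genes that a per-SNP GWAS threshold alone would miss.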
We identified significant subnetworks in rifampicin-resistant samples. Notably, these subnetworks included rpoB and rpoC, which are drug targets of rifampicin. In the protein structure of rpoB, the mutation site was extremely close to the drug binding site, at a distance of only 3.97 Å. Molecular dynamics simulation revealed that the binding energy between rpoB and rifampicin decreased after the D435V mutation. This mutation can therefore substantially affect drug-target binding affinity. In addition, topA and pyrG have been reported to be linked with drug resistance and might be new TB drug targets. Other genes that have not yet been reported are worth further study.
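A proximity figure such as the 3.97 Å above is typically a minimum interatomic distance between a residue and a ligand. The abstract does not say how the distance was measured, so the following is only a sketch of one common definition. The function name and the coordinates are hypothetical and do not come from any rpoB structure.

```python
import math


def min_distance(atoms_a, atoms_b):
    """Smallest Euclidean distance between two sets of 3D atom coordinates.

    The result is in the same unit as the input coordinates (Å for PDB files).
    """
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


# Hypothetical coordinates for illustration only.
residue_atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
ligand_atoms = [(4.0, 0.0, 0.0), (1.0, 3.0, 4.0)]

print(min_distance(residue_atoms, ligand_atoms))
```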
Using a heat diffusion model in combination with GWAS results and protein-protein interactions, we identified significantly mutated subnetworks in rifampicin-resistant samples. These subnetworks contained not only the known targets of rifampicin (rpoB, rpoC) but also topA and pyrG, which are potentially associated with drug resistance. Together, these results offer deeper insights into the drug resistance of Mtb and provide potential targets for new antituberculosis drugs.