Zhang Yu, Zhang Cong, Huo Wenwen, Wang Xinlei, Zhang Michael, Palmer Kelli, Chen Min
School of Mathematical Sciences, Ocean University of China, Qingdao, 266000 China.
Department of Mathematical Sciences, University of Texas at Dallas, Richardson, TX 75080 USA.
Mar Life Sci Technol. 2023;5(1):28-43. doi: 10.1007/s42995-022-00144-z. Epub 2023 Jan 31.
The emergence of antibiotic resistance in bacteria limits the antibiotic choices available for treatment and infection control, and thereby represents a major threat to human health. De novo mutation of bacterial genomes is an essential mechanism by which bacteria acquire antibiotic resistance. Deletion mutations within bacterial immune systems, ranging from dozens to thousands of base pairs (bp) in length, have previously been associated with the spread of antibiotic resistance. Most current methods for evaluating genomic structural variations (SVs) concentrate on detecting them rather than on estimating the proportions of a population that carry distinct SVs. A better understanding of the distribution of mutations and of subpopulation dynamics in bacterial populations is needed to understand the evolution of antibiotic resistance and the movement of resistance genes through populations. Here, we propose a statistical model, based on the Expectation-Maximization (EM) algorithm and next-generation sequencing (NGS) data, to estimate the proportions of genomic deletions in a mixed population. The method integrates both insert size and split-read mapping information to iteratively update the estimated distributions. The proposed method was evaluated in three simulations, which showed that it produces accurate estimates. The method was then applied to investigate horizontal transfer of antibiotic resistance genes in concert with changes in the CRISPR-Cas system of .
The online version contains supplementary material available at 10.1007/s42995-022-00144-z.
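The abstract describes an EM procedure that uses insert sizes to estimate what fraction of a mixed population carries a deletion. The Python sketch below illustrates that general idea only; it is not the authors' model. It assumes a single deletion of known length, assumes that read pairs spanning the deletion have normally distributed mapped insert sizes (shifted by the deletion length in carriers), and ignores split-read information entirely. All function and parameter names are hypothetical.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def em_deletion_proportion(insert_sizes, mean, sd, deletion_len,
                           n_iter=200, tol=1e-10):
    """Estimate the proportion of read pairs that come from deletion carriers.

    Assumed (illustrative) model: a two-component mixture with known
    component densities, N(mean, sd) for wild type and
    N(mean + deletion_len, sd) for carriers. Only the mixing
    proportion is estimated.
    """
    if not insert_sizes:
        raise ValueError("insert_sizes must not be empty")
    pi = 0.5  # initial guess for the deletion proportion
    for _ in range(n_iter):
        # E-step: posterior probability that each pair comes from a carrier.
        resp_sum = 0.0
        for x in insert_sizes:
            p_del = pi * normal_pdf(x, mean + deletion_len, sd)
            p_wt = (1.0 - pi) * normal_pdf(x, mean, sd)
            total = p_del + p_wt
            # If both densities underflow, the observation is uninformative.
            resp_sum += p_del / total if total > 0.0 else pi
        # M-step: the new proportion is the average responsibility.
        new_pi = resp_sum / len(insert_sizes)
        converged = abs(new_pi - pi) < tol
        pi = new_pi
        if converged:
            break
    return pi


def simulate_insert_sizes(n, true_pi, mean, sd, deletion_len, seed=0):
    """Draw n mapped insert sizes from the assumed two-component mixture."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(n):
        shift = deletion_len if rng.random() < true_pi else 0.0
        sizes.append(rng.gauss(mean + shift, sd))
    return sizes


if __name__ == "__main__":
    sizes = simulate_insert_sizes(5000, 0.3, 300.0, 30.0, 500.0, seed=1)
    print(em_deletion_proportion(sizes, 300.0, 30.0, 500.0))
```

With well-separated components, as in this toy setting, the estimate settles after a few iterations. A realistic model would also need to handle several candidate deletions, uneven coverage, and the split-read evidence that the abstract mentions.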