Beyond the SNP Threshold: Identifying Outbreak Clusters Using Inferred Transmissions.

Author affiliations

Department of Mathematics, Imperial College London, London, UK.

British Columbia Centre for Disease Control, Communicable Disease Prevention and Control Services, Vancouver, Canada.

Publication information

Mol Biol Evol. 2019 Mar 1;36(3):587-603. doi: 10.1093/molbev/msy242.

Abstract

Whole-genome sequencing (WGS) is increasingly used to aid the understanding of pathogen transmission. A first step in analyzing WGS data is usually to define "transmission clusters," sets of cases that are potentially linked by direct transmission. This is often done by placing two cases in the same cluster if they are separated by fewer single-nucleotide polymorphisms (SNPs) than a specified threshold. However, there is little agreement as to what an appropriate threshold should be. We propose a probabilistic alternative, suggesting that the key inferential target for transmission clusters is the number of transmissions separating cases. We characterize this by combining the number of SNP differences and the length of time over which those differences have accumulated, using information about case timing, molecular clock, and transmission processes. Our framework has the advantage of allowing for variable mutation rates across the genome and can incorporate other epidemiological data. We use two tuberculosis studies to illustrate the impact of our approach: with British Columbia data we use spatial divisions, and with Republic of Moldova data we incorporate antibiotic resistance. Simulation results indicate that our transmission-based method is better at identifying direct transmissions than a SNP threshold, with an average dissimilarity between clusterings of 0.27 bits compared with 0.37 bits for the SNP-threshold method and 0.84 bits for randomly permuted data. These results show that it is likely to outperform the SNP-threshold method where clock rates are variable and sample collection times are spread out. We implement the method in the R package transcluster.
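For readers unfamiliar with the baseline the abstract argues against, the following is a minimal Python sketch of SNP-threshold clustering. It is not the authors' code and does not use the transcluster API (the package itself is written in R). It assumes that sequences are already aligned to equal length, that "fewer SNPs than a specified threshold" means a strict inequality, and that pairwise links are chained transitively (single linkage). The function names `snp_distance` and `snp_threshold_clusters` and the toy sequences are hypothetical.

```python
from itertools import combinations


def snp_distance(a, b):
    """Count the positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def snp_threshold_clusters(seqs, threshold):
    """Group cases whose pairwise SNP distance is below `threshold`.

    Assumption: links are chained (single linkage), so two cases can share
    a cluster through an intermediate case even if their own distance is
    at or above the threshold.
    """
    # Union-find structure: every case starts as its own cluster root.
    parent = {name: name for name in seqs}

    def find(x):
        # Walk up to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge the clusters of every pair separated by fewer SNPs than the threshold.
    for a, b in combinations(seqs, 2):
        if snp_distance(seqs[a], seqs[b]) < threshold:
            parent[find(a)] = find(b)

    # Collect the members of each root into a set.
    clusters = {}
    for name in seqs:
        clusters.setdefault(find(name), set()).add(name)
    return sorted(clusters.values(), key=sorted)


# Toy, made-up sequences purely for illustration.
example = {"A": "AAAA", "B": "AAAT", "C": "AATT", "D": "GGGG"}
print(snp_threshold_clusters(example, threshold=2))
```

In this toy input, cases A and C differ by 2 SNPs, which is not below the threshold, yet they end up together because each is within 1 SNP of B. The sketch also ignores sampling times entirely, which is the gap the abstract's probabilistic framing addresses by combining SNP counts with the time over which they accumulated.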
