BugMat and FindNeighbour: command line and server applications for investigating bacterial relatedness.

Authors

Mazariegos-Canellas Oriol, Do Trien, Peto Tim, Eyre David W, Underwood Anthony, Crook Derrick, Wyllie David H

Affiliations

Nuffield Department of Medicine, John Radcliffe Hospital, Headley Way, Oxford, OX3 9DU, UK.

Public Health England, 61 Colindale Avenue, London, NW9 5EQ, UK.

Publication details

BMC Bioinformatics. 2017 Nov 13;18(1):477. doi: 10.1186/s12859-017-1907-2.

DOI: 10.1186/s12859-017-1907-2
PMID: 29132318
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5683244/
Abstract

BACKGROUND

Large scale bacterial sequencing has made the determination of genetic relationships within large sequence collections of bacterial genomes derived from the same microbial species an increasingly common task. Solutions to the problem have application to public health (for example, in the detection of possible disease transmission), and as part of divide-and-conquer strategies selecting groups of similar isolates for computationally intensive methods of phylogenetic inference using (for example) maximal likelihood methods. However, the generation and maintenance of distance matrices is computationally intensive, and rapid methods of doing so are needed to allow translation of microbial genomics into public health actions.

RESULTS

We developed, tested and deployed three solutions. BugMat is a fast C++ application which generates one-off in-memory distance matrices. FindNeighbour and FindNeighbour2 are server-side applications which build, maintain, and persist either complete (for FindNeighbour) or sparse (for FindNeighbour2) distance matrices given a set of sequences. FindNeighbour and BugMat use a variation model to accelerate computation, while FindNeighbour2 uses reference-based compression. Performance metrics show scalability into tens of thousands of sequences, with options for scaling further.
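
The ideas of reference-based compression and a sparse distance matrix can be illustrated with a short Python sketch. This is not the authors' implementation: the function names (`compress`, `snp_distance`, `sparse_neighbours`), the toy sequences, the choice to ignore positions called `N`, and the cutoff of 2 are all assumptions made for illustration. Each sequence is stored only as its differences from a reference, and only pairs within the cutoff are kept.

```python
from itertools import combinations


def compress(seq, ref):
    """Store a sequence as its differences from the reference.

    Returns (diffs, unknown): diffs maps position -> base where the
    sequence differs from ref; unknown is the set of positions called 'N'.
    """
    if len(seq) != len(ref):
        raise ValueError("sequence and reference must have equal length")
    diffs, unknown = {}, set()
    for pos, (s, r) in enumerate(zip(seq, ref)):
        if s == "N":
            unknown.add(pos)
        elif s != r:
            diffs[pos] = s
    return diffs, unknown


def snp_distance(a, b):
    """Count positions where two compressed sequences differ.

    Positions uncalled ('N') in either sequence are ignored; this is an
    assumption of the sketch, not a rule taken from the paper.
    """
    diffs_a, unk_a = a
    diffs_b, unk_b = b
    masked = unk_a | unk_b
    dist = 0
    # Only positions where at least one sequence differs from the
    # reference can contribute to the distance.
    for pos in diffs_a.keys() | diffs_b.keys():
        if pos not in masked and diffs_a.get(pos) != diffs_b.get(pos):
            dist += 1
    return dist


def sparse_neighbours(samples, ref, cutoff):
    """Return {(name1, name2): distance} for pairs with distance <= cutoff."""
    compressed = {name: compress(seq, ref) for name, seq in samples.items()}
    result = {}
    for n1, n2 in combinations(sorted(compressed), 2):
        d = snp_distance(compressed[n1], compressed[n2])
        if d <= cutoff:
            result[(n1, n2)] = d
    return result


# Toy data (hypothetical): a 10-base reference and four samples.
ref = "ACGTACGTAC"
samples = {
    "s1": "ACGTACGTAC",
    "s2": "ACGAACGTAC",
    "s3": "TCGAACGTNC",
    "s4": "TGCAACCTAC",
}
neighbours = sparse_neighbours(samples, ref, cutoff=2)
print(neighbours)
```

With these toy inputs, pairs involving `s4` are dropped because they exceed the cutoff, which is what keeps the stored matrix sparse. A complete matrix, as described for FindNeighbour and BugMat, would correspond to keeping every pair regardless of distance.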

CONCLUSION

Three applications, each with distinct strengths and weaknesses, are available for distance-matrix based analysis of large bacterial collections. Deployed as part of the Public Health England solution for M. tuberculosis genomic processing, they will have wide applicability.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f721/5683244/f69f05885da5/12859_2017_1907_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f721/5683244/e0e53b55c756/12859_2017_1907_Fig1_HTML.jpg