Mandrake: visualizing microbial population structure by embedding millions of genomes into a low-dimensional representation.

Affiliations

MRC Centre for Global Infectious Disease Analysis, School of Public Health, Imperial College London, London W2 1PG, UK.

European Molecular Biology Laboratory, European Bioinformatics Institute EMBL-EBI, Hinxton CB10 1SD, UK.

Publication information

Philos Trans R Soc Lond B Biol Sci. 2022 Oct 10;377(1861):20210237. doi: 10.1098/rstb.2021.0237. Epub 2022 Aug 22.

DOI:10.1098/rstb.2021.0237
PMID:35989601
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9393562/
Abstract

In less than a decade, population genomics of microbes has progressed from the effort of sequencing dozens of strains to thousands, or even tens of thousands of strains in a single study. There are now hundreds of thousands of genomes available even for a single bacterial species, and the number of genomes is expected to continue to increase at an accelerated pace given the advances in sequencing technology and widespread genomic surveillance initiatives. This explosion of data calls for innovative methods to enable rapid exploration of the structure of a population based on different data modalities, such as multiple sequence alignments, assemblies and estimates of gene content across different genomes. Here, we present Mandrake, an efficient implementation of a dimensional reduction method tailored for the needs of large-scale population genomics. Mandrake is capable of visualizing population structure from millions of whole genomes, and we illustrate its usefulness with several datasets representing major pathogens. Our method is freely available both as an analysis pipeline (https://github.com/johnlees/mandrake) and as a browser-based interactive application (https://gtonkinhill.github.io/mandrake-web/). This article is part of a discussion meeting issue 'Genomic population structures of microbial pathogens'.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0df/9393562/fecd68cc95a0/rstb20210237f01.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0df/9393562/f21aa5afade8/rstb20210237f02.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0df/9393562/a72085e04839/rstb20210237f03.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0df/9393562/1c68691d31aa/rstb20210237f04.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0df/9393562/b5f27ae5da4f/rstb20210237f05.jpg
