Decomposition of the pangenome matrix reveals a structure in gene distribution in the Escherichia coli species.

Authors

Chauhan Siddharth M, Ardalani Omid, Hyun Jason C, Monk Jonathan M, Phaneuf Patrick V, Palsson Bernhard O

Affiliations

Department of Bioengineering, University of California, San Diego, La Jolla, California, USA.

Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kemitorvet, Kongens, Lyngby, Denmark.

Publication information

mSphere. 2025 Jan 28;10(1):e0053224. doi: 10.1128/msphere.00532-24. Epub 2024 Dec 31.

DOI:10.1128/msphere.00532-24
PMID:39745367
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11774025/
Abstract

Thousands of complete genome sequences for strains of a species that are now available enable the advancement of pangenome analytics to a new level of sophistication. We collected 2,377 publicly available complete genomes of Escherichia coli for detailed pangenome analysis. The core genome and accessory genomes consisted of 2,398 and 5,182 genes, respectively. We developed a machine learning approach to define the accessory genes characterizing the major phylogroups of E. coli plus Shigella: A, B1, B2, C, D, E, F, G, and Shigella. The analysis resulted in a detailed structure of the genetic basis of the phylogroups' differential traits. This pangenome structure was largely consistent with a housekeeping-gene-based MLST distribution, sequence-based Mash distance, and the Clermont quadruplex classification. The rare genome (consisting of genes found in <6.8% of all strains) consisted of 163,619 genes, about 79% of which represented variations of 315 underlying transposon elements. This analysis generated a mathematical definition of the genetic basis for a species.

IMPORTANCE

The comprehensive analysis of the pangenome of Escherichia coli presented in this study marks a significant advancement in understanding bacterial genetic diversity. By employing machine learning techniques to analyze 2,377 complete genomes, the study provides a detailed mapping of core, accessory, and rare genes. This approach reveals the genetic basis for differential traits across phylogroups, offering insights into pathogenicity, antibiotic resistance, and evolutionary adaptations. The findings enhance the potential for genome-based diagnostics and pave the way for future studies aimed at achieving a global genetic definition of bacterial phylogeny.
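
The split into core, accessory, and rare genes described in the abstract can be illustrated with a small frequency-based partition of a gene presence/absence matrix. This is only a minimal sketch, not the authors' machine learning method: the 6.8% rare cutoff comes from the abstract, while the core cutoff (`CORE_MIN_FREQ`), the function name `partition_pangenome`, and the toy gene names are assumptions made for illustration.

```python
from typing import Dict, List

RARE_MAX_FREQ = 0.068  # from the abstract: rare genes occur in <6.8% of strains
CORE_MIN_FREQ = 0.99   # hypothetical cutoff; the abstract does not state one


def partition_pangenome(matrix: Dict[str, List[int]]) -> Dict[str, List[str]]:
    """Split genes into core / accessory / rare by carrier frequency.

    `matrix` maps a gene name to a 0/1 presence vector with one entry
    per strain (all vectors are assumed non-empty and of equal length).
    """
    groups: Dict[str, List[str]] = {"core": [], "accessory": [], "rare": []}
    for gene, presence in matrix.items():
        freq = sum(presence) / len(presence)  # fraction of strains with the gene
        if freq < RARE_MAX_FREQ:
            groups["rare"].append(gene)
        elif freq >= CORE_MIN_FREQ:
            groups["core"].append(gene)
        else:
            groups["accessory"].append(gene)
    return groups


# Toy matrix: 100 strains and three made-up genes.
toy = {
    "geneA": [1] * 100,            # present in every strain
    "geneB": [1] * 40 + [0] * 60,  # present in 40% of strains
    "geneC": [1] * 3 + [0] * 97,   # present in 3% of strains
}
groups = partition_pangenome(toy)
print(groups)
```

With real data, each presence vector would have one entry per genome in the collection, and the accessory group would then be the input to whatever phylogroup-level analysis is applied next.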

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8b1/11774025/301c0b7389f7/msphere.00532-24.f005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8b1/11774025/f4c7fdfa790e/msphere.00532-24.f001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8b1/11774025/76eefd10e72c/msphere.00532-24.f002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8b1/11774025/dd5f807a8478/msphere.00532-24.f003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d8b1/11774025/e838b0f15a0a/msphere.00532-24.f004.jpg
