CoreCruncher: Fast and Robust Construction of Core Genomes in Large Prokaryotic Data Sets.

Affiliation

Department of Biology, University of North Carolina Greensboro, Greensboro, NC.

Publication information

Mol Biol Evol. 2021 Jan 23;38(2):727-734. doi: 10.1093/molbev/msaa224.

DOI:10.1093/molbev/msaa224
PMID:32886787
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7826169/
Abstract

The core genome is the set of genes shared by all, or nearly all, strains of a given population or species of prokaryotes. Inferring the core genome is integral to many genomic analyses; however, most methods rely on comparing all pairs of genomes, a step that is becoming increasingly difficult given the massive accumulation of genomic data. Here, we present CoreCruncher, a program that robustly and rapidly constructs core genomes across hundreds or thousands of genomes. CoreCruncher does not compute all pairwise genome comparisons; instead, it uses a heuristic based on the distributions of identity scores to classify sequences as orthologs or paralogs/xenologs. Although it is much faster than current methods, our results indicate that our approach is more conservative than other tools and less sensitive to the presence of paralogs and xenologs. CoreCruncher is freely available from https://github.com/lbobay/CoreCruncher. It is written in Python 3.7 and can also run on Python 2.7 without modification. It requires the Python library NumPy and either USEARCH or BLAST. Certain options require MUSCLE or MAFFT.
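
The two ideas in the abstract, skipping the all-versus-all comparison and classifying sequences from the distribution of identity scores, can be illustrated with a minimal Python sketch. It is not CoreCruncher's actual code. The function names, the choice of a single reference genome, and the median-based outlier rule (with its `max_deviations` and `min_spread` parameters) are all assumptions made for illustration; the abstract does not specify the heuristic at this level of detail.

```python
from statistics import median


def comparison_counts(n_genomes):
    """Return (all-vs-all, reference-vs-rest) numbers of genome comparisons.

    Assumption: the alternative to all pairwise comparisons is to compare
    every genome against one reference genome.
    """
    all_pairs = n_genomes * (n_genomes - 1) // 2
    reference_only = n_genomes - 1
    return all_pairs, reference_only


def flag_divergent_hits(identities, max_deviations=3.0, min_spread=1.0):
    """Split one gene family's hits into likely orthologs and suspect hits.

    identities: dict mapping genome name -> percent identity of that genome's
    best hit to the reference gene.

    Hypothetical rule: a hit is suspect (possible paralog or xenolog) when its
    identity lies more than `max_deviations` median absolute deviations below
    the family median. `min_spread` keeps the threshold from collapsing to
    zero when almost all scores are identical.
    """
    scores = list(identities.values())
    center = median(scores)
    mad = median(abs(s - center) for s in scores)
    spread = max(mad, min_spread)

    orthologs, suspects = [], []
    for genome, score in identities.items():
        if center - score > max_deviations * spread:
            suspects.append(genome)
        else:
            orthologs.append(genome)
    return orthologs, suspects


if __name__ == "__main__":
    print(comparison_counts(1000))
    family = {"g1": 98.0, "g2": 97.0, "g3": 96.5, "g4": 97.5, "g5": 71.0}
    print(flag_divergent_hits(family))
```

For 1,000 genomes, the first function contrasts 499,500 pairwise comparisons with 999 comparisons against a reference. In the toy gene family, only the hit from `g5` falls far below the other identity scores and is set aside as suspect.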


