Annotation-free delineation of prokaryotic homology groups.

Affiliations

Department of Computer Science, Rice University, Houston, Texas, United States of America.

Department of BioSciences, Rice University, Houston, Texas, United States of America.

Publication information

PLoS Comput Biol. 2022 Jun 8;18(6):e1010216. doi: 10.1371/journal.pcbi.1010216. eCollection 2022 Jun.

DOI:10.1371/journal.pcbi.1010216
PMID:35675326
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9212150/
Abstract

Phylogenomic studies of prokaryotic taxa often assume conserved marker genes are homologous across their length. However, processes such as horizontal gene transfer or gene duplication and loss may disrupt this homology by recombining only parts of genes, causing gene fission or fusion. We show using simulation that it is necessary to delineate homology groups in a set of bacterial genomes without relying on gene annotations to define the boundaries of homologous regions. To solve this problem, we have developed a graph-based algorithm to partition a set of bacterial genomes into Maximal Homologous Groups of sequences (MHGs), where each MHG is a maximal set of maximum-length sequences that are homologous across the entire sequence alignment. We applied our algorithm to a dataset of 19 Enterobacteriaceae species and found that MHGs cover much greater proportions of genomes than markers and, relatedly, are less biased in terms of the functions of the genes they cover. We zoomed in on the correlation between each individual marker and its overlapping MHGs, and show that few phylogenetic splits supported by the markers are supported by the MHGs, while many marker-supported splits are contradicted by the MHGs. A comparison of the species tree inferred from marker genes with the species tree inferred from MHGs suggests that the increased bias and lack of genome coverage by markers cause incorrect inferences as to the overall relationship between bacterial taxa.
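The abstract's distinction between splits that are "supported" and splits that are "contradicted" can be illustrated with the standard notion of split compatibility. The sketch below is not the authors' pipeline; it assumes each tree has already been reduced to a set of bipartitions over a shared taxon set, and the names `normalize`, `compatible`, and `classify` are hypothetical helpers introduced only for illustration. A marker split counts as supported when the identical split appears in the MHG tree, contradicted when it is incompatible with at least one MHG split, and unresolved otherwise.

```python
from typing import Dict, FrozenSet, Iterable

Split = FrozenSet[str]


def normalize(side: Iterable[str], taxa: FrozenSet[str]) -> Split:
    """Canonical form of a bipartition: the side that excludes the smallest taxon name."""
    side = frozenset(side) & taxa
    anchor = min(taxa)
    return taxa - side if anchor in side else side


def is_trivial(split: Split, taxa: FrozenSet[str]) -> bool:
    """Splits with fewer than two taxa on either side carry no grouping information."""
    return len(split) < 2 or len(taxa - split) < 2


def compatible(a: Split, b: Split, taxa: FrozenSet[str]) -> bool:
    """Two splits fit on one tree iff at least one of the four side intersections is empty."""
    a_c, b_c = taxa - a, taxa - b
    return not (a & b and a & b_c and a_c & b and a_c & b_c)


def classify(
    marker_splits: Iterable[Iterable[str]],
    mhg_splits: Iterable[Iterable[str]],
    taxa: Iterable[str],
) -> Dict[Split, str]:
    """Label each non-trivial marker split as supported, contradicted, or unresolved."""
    taxa = frozenset(taxa)
    mhg = {normalize(s, taxa) for s in mhg_splits}
    mhg = {s for s in mhg if not is_trivial(s, taxa)}
    labels: Dict[Split, str] = {}
    for raw in marker_splits:
        s = normalize(raw, taxa)
        if is_trivial(s, taxa):
            continue
        if s in mhg:
            labels[s] = "supported"
        elif any(not compatible(s, m, taxa) for m in mhg):
            labels[s] = "contradicted"
        else:
            labels[s] = "unresolved"
    return labels


# Toy example with five hypothetical taxa A..E.
result = classify(
    marker_splits=[{"A", "B"}, {"A", "B", "C"}],  # AB|CDE and ABC|DE
    mhg_splits=[{"A", "B"}, {"C", "D"}],          # AB|CDE and CD|ABE
    taxa="ABCDE",
)
```

In this toy example the marker split AB|CDE also appears in the MHG tree, while ABC|DE conflicts with the MHG split CD|ABE, so the two marker splits fall into different categories. Real analyses would also need to restrict both trees to their shared taxa and decide how to treat weakly supported branches, neither of which this sketch addresses.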

