GraphPlas: Refined Classification of Plasmid Sequences Using Assembly Graphs.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2022 Jan-Feb;19(1):57-67. doi: 10.1109/TCBB.2021.3082915. Epub 2022 Feb 3.

DOI: 10.1109/TCBB.2021.3082915
PMID: 34029192

Abstract

Plasmids are extra-chromosomal genetic materials with important markers that affect the function and behaviour of the microorganisms supporting their environmental adaptations. Hence the identification and recovery of such plasmid sequences from assemblies is a crucial task in metagenomics analysis. In the past, machine learning approaches have been developed to separate chromosomes and plasmids. However, there is always a compromise between precision and recall in the existing classification approaches. The similarity of compositions between chromosomes and their plasmids makes it difficult to separate plasmids and chromosomes with high accuracy. However, high confidence classifications are accurate with a significant compromise of recall, and vice versa. Hence, the requirement exists to have more sophisticated approaches to separate plasmids and chromosomes accurately while retaining an acceptable trade-off between precision and recall. We present GraphPlas, a novel approach for plasmid recovery using coverage, composition and assembly graph topology. We evaluated GraphPlas on simulated and real short read assemblies with varying compositions of plasmids and chromosomes. Our experiments show that GraphPlas is able to significantly improve accuracy in detecting plasmid and chromosomal contigs on top of popular state-of-the-art plasmid detection tools. The source code is freely available at: https://github.com/anuradhawick/GraphPlas.
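The abstract does not describe how coverage, composition and graph topology are combined, so the following is only a minimal sketch of the general idea it motivates, not GraphPlas's actual algorithm. It assumes a hypothetical upstream classifier that outputs a plasmid probability per contig, keeps only the high-confidence calls (precise but low recall), and then spreads those labels to unlabeled neighbours in the assembly graph. All names, thresholds and toy data here are illustrative assumptions.

```python
from collections import deque

def threshold_labels(probs, hi=0.9, lo=0.1):
    """Keep only high-confidence calls; other contigs stay unlabeled."""
    labels = {}
    for contig, p in probs.items():
        if p >= hi:
            labels[contig] = "plasmid"
        elif p <= lo:
            labels[contig] = "chromosome"
    return labels

def propagate(labels, edges):
    """Breadth-first spread of seed labels to unlabeled graph neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    result = dict(labels)
    queue = deque(sorted(labels))
    while queue:
        u = queue.popleft()
        for v in sorted(adj.get(u, ())):
            if v not in result:  # never overwrite an existing label
                result[v] = result[u]
                queue.append(v)
    return result

def recall(labels, truth, cls):
    """Fraction of contigs of class `cls` that were labeled as `cls`."""
    relevant = [c for c, t in truth.items() if t == cls]
    found = [c for c in relevant if labels.get(c) == cls]
    return len(found) / len(relevant)

# Toy data (hypothetical): plasmid probabilities, graph edges, ground truth.
probs = {"c1": 0.95, "c2": 0.6, "c3": 0.05, "c4": 0.4, "c5": 0.5}
edges = [("c1", "c2"), ("c3", "c4")]
truth = {"c1": "plasmid", "c2": "plasmid",
         "c3": "chromosome", "c4": "chromosome", "c5": "chromosome"}

seed = threshold_labels(probs)
refined = propagate(seed, edges)
print(recall(seed, truth, "plasmid"))     # 0.5
print(recall(refined, truth, "plasmid"))  # 1.0
```

In this toy case the seed labels recover only half of the plasmid contigs, while propagation along graph edges recovers the rest; the isolated contig `c5` stays unlabeled because it has no labeled neighbour. A real tool would also need to handle conflicting labels and use coverage and composition, which this sketch omits.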

Similar articles

1
GraphPlas: Refined Classification of Plasmid Sequences Using Assembly Graphs.
IEEE/ACM Trans Comput Biol Bioinform. 2022 Jan-Feb;19(1):57-67. doi: 10.1109/TCBB.2021.3082915. Epub 2022 Feb 3.
2
3CAC: improving the classification of phages and plasmids in metagenomic assemblies using assembly graphs.
Bioinformatics. 2022 Sep 16;38(Suppl_2):ii56-ii61. doi: 10.1093/bioinformatics/btac468.
3
PLASMe: a tool to identify PLASMid contigs from short-read assemblies using transformer.
Nucleic Acids Res. 2023 Aug 25;51(15):e83. doi: 10.1093/nar/gkad578.
4
PlasClass improves plasmid sequence classification.
PLoS Comput Biol. 2020 Apr 3;16(4):e1007781. doi: 10.1371/journal.pcbi.1007781. eCollection 2020 Apr.
5
PlasBin-flow: a flow-based MILP algorithm for plasmid contigs binning.
Bioinformatics. 2023 Jun 30;39(39 Suppl 1):i288-i296. doi: 10.1093/bioinformatics/btad250.
6
HOTSPOT: hierarchical host prediction for assembled plasmid contigs with transformer.
Bioinformatics. 2023 May 4;39(5). doi: 10.1093/bioinformatics/btad283.
7
SCAPP: an algorithm for improved plasmid assembly in metagenomes.
Microbiome. 2021 Jun 25;9(1):144. doi: 10.1186/s40168-021-01068-z.
8
Platon: identification and characterization of bacterial plasmid contigs in short-read draft assemblies exploiting protein sequence-based replicon distribution scores.
Microb Genom. 2020 Oct;6(10). doi: 10.1099/mgen.0.000398.
9
gplas: a comprehensive tool for plasmid analysis using short-read graphs.
Bioinformatics. 2020 Jun 1;36(12):3874-3876. doi: 10.1093/bioinformatics/btaa233.
10
MOB-suite: software tools for clustering, reconstruction and typing of plasmids from draft assemblies.
Microb Genom. 2018 Aug;4(8). doi: 10.1099/mgen.0.000206. Epub 2018 Jul 27.

Cited by

1
Plasmer: an Accurate and Sensitive Bacterial Plasmid Prediction Tool Based on Machine Learning of Shared k-mers and Genomic Features.
Microbiol Spectr. 2023 Jun 15;11(3):e0464522. doi: 10.1128/spectrum.04645-22. Epub 2023 May 16.