RFPlasmid: predicting plasmid sequences from short-read assembly data using machine learning.

Affiliations

Faculty of Veterinary Medicine, Department of Infectious Diseases and Immunology, Utrecht University, Utrecht, The Netherlands.

WHO Collaborating Centre for Reference and Research on Campylobacter and Antimicrobial Resistance from a One Health Perspective/OIE Reference Laboratory for Campylobacteriosis, Utrecht, The Netherlands.

Publication information

Microb Genom. 2021 Nov;7(11). doi: 10.1099/mgen.0.000683.

Abstract

Antimicrobial-resistance (AMR) genes in bacteria are often carried on plasmids, and these plasmids can transfer AMR genes between bacteria. For molecular epidemiology and risk assessment, it is important to know whether the genes are located on highly transferable plasmids or on the more stable chromosome. However, draft whole-genome sequences are fragmented, which makes it difficult to discriminate plasmid contigs from chromosomal contigs. Current methods that predict plasmid sequences from draft genome sequences rely on single features, such as k-mer composition, circularity of the DNA molecule, copy number, or sequence identity to plasmid replication genes. Each of these has drawbacks, especially when faced with large single-copy plasmids, which often carry resistance genes. Our newly developed prediction tool, RFPlasmid, combines multiple features, including k-mer composition and databases of plasmid and chromosomal marker proteins, to predict whether the likely source of a contig is a plasmid or the chromosome. RFPlasmid supports models for 17 different bacterial taxa and has a taxon-agnostic model for metagenomic assemblies or unsupported organisms. RFPlasmid is available both as a standalone tool and via a web interface.
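To make the notion of k-mer composition concrete, the following is a minimal sketch of how a contig could be turned into a fixed-length vector of k-mer frequencies. It is an illustration only and not RFPlasmid's own code: the function name `kmer_frequencies`, the choice of k = 3, and the example contig are all assumptions made here, and the abstract does not state which k or which normalization the tool uses.

```python
from collections import Counter
from itertools import product
from typing import Dict


def kmer_frequencies(sequence: str, k: int = 3) -> Dict[str, float]:
    """Return the relative frequency of every DNA k-mer in `sequence`.

    Hypothetical helper for illustration; not part of RFPlasmid.
    """
    sequence = sequence.upper()
    # Enumerate all 4**k possible k-mers so every contig yields
    # a feature vector of the same length and order.
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count overlapping windows of length k along the contig.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Windows containing ambiguous bases (for example N) are ignored.
    total = sum(counts[kmer] for kmer in all_kmers)
    if total == 0:
        return {kmer: 0.0 for kmer in all_kmers}
    return {kmer: counts[kmer] / total for kmer in all_kmers}


# Hypothetical contig, used only to show the shape of the output.
contig = "ATGCGATACGCTTGAGGCTAACGT"
features = kmer_frequencies(contig, k=3)
print(len(features))  # one entry per possible 3-mer
```

A vector like this could then be concatenated with other per-contig features, such as hits to plasmid or chromosomal marker proteins, before being passed to a classifier. How RFPlasmid actually combines its features is described in the paper, not in this sketch.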

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ffad/8743549/1dc4f7c70ab6/mgen-7-0683-g001.jpg
