cBar: a computer program to distinguish plasmid-derived from chromosome-derived sequence fragments in metagenomics data.

Author information

Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA 30602, USA.

Publication information

Bioinformatics. 2010 Aug 15;26(16):2051-2. doi: 10.1093/bioinformatics/btq299. Epub 2010 Jun 10.

Abstract

SUMMARY

A huge amount of metagenomic sequence data has been produced as a result of rapidly increasing efforts worldwide to study microbial communities as a whole. Most, if not all, sequenced metagenomes are complex mixtures of chromosomal and plasmid sequence fragments from multiple organisms, possibly from different kingdoms. Computational methods for predicting genomic elements such as genes differ significantly between chromosomes and plasmids, which raises the need to separate chromosomal from plasmid sequences in a metagenome. We present a program that classifies a metagenome set into chromosomal and plasmid sequences based on their distinguishing pentamer frequencies. On a large training set consisting of all the sequenced prokaryotic chromosomes and plasmids, the program achieves approximately 92% classification accuracy. On a large set of simulated metagenomes with sequence lengths ranging from 300 bp to 100 kbp, the program's classification accuracy ranges from 64.45% to 88.75%. On a large independent test set, the program achieves 88.29% classification accuracy.
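To make the phrase "pentamer frequencies" concrete, the following minimal sketch shows one way a sequence fragment could be turned into a 1024-dimensional vector of relative 5-mer frequencies. It is an illustration only, not cBar's code: the names `pentamer_frequencies`, `PENTAMERS` and `INDEX` are hypothetical, skipping windows that contain ambiguous bases is an assumption, and the abstract does not say whether reverse-complement pentamers are merged or which classifier consumes the features.

```python
from itertools import product

ALPHABET = "ACGT"
K = 5  # pentamers

# All 4**5 = 1024 possible pentamers in a fixed lexicographic order
# (hypothetical helper names, not taken from cBar).
PENTAMERS = ["".join(p) for p in product(ALPHABET, repeat=K)]
INDEX = {kmer: i for i, kmer in enumerate(PENTAMERS)}


def pentamer_frequencies(seq):
    """Return a list of 1024 relative pentamer frequencies for seq."""
    seq = seq.upper()
    counts = [0] * len(PENTAMERS)
    total = 0
    # Slide a window of width 5 across the sequence, one base at a time.
    for i in range(len(seq) - K + 1):
        idx = INDEX.get(seq[i:i + K])
        # Assumption: windows containing N or other ambiguity codes are skipped.
        if idx is not None:
            counts[idx] += 1
            total += 1
    if total == 0:
        # Sequence shorter than 5 bases, or no unambiguous window.
        return [0.0] * len(PENTAMERS)
    return [c / total for c in counts]


# Example: feature vector for a short fragment containing one ambiguous base.
freqs = pentamer_frequencies("ACGTACGTACGTNACGTA")
```

A vector like `freqs` could then be passed to any standard binary classifier trained on labeled chromosome and plasmid fragments. Short fragments yield sparse, noisy vectors, which is consistent with the lower accuracy reported for 300 bp sequences.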

AVAILABILITY

The program has been implemented as a standalone prediction program, cBar, which is available at http://csbl.bmb.uga.edu/~ffzhou/cBar.
