Exploiting topic modeling to boost metagenomic reads binning.

Authors

Zhang Ruichang, Cheng Zhanzhan, Guan Jihong, Zhou Shuigeng

Publication information

BMC Bioinformatics. 2015;16 Suppl 5(Suppl 5):S2. doi: 10.1186/1471-2105-16-S5-S2. Epub 2015 Mar 18.


DOI: 10.1186/1471-2105-16-S5-S2
PMID: 25859745
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4402587/
Abstract

BACKGROUND: With the rapid development of high-throughput technologies, researchers can sequence the whole metagenome of a microbial community sampled directly from the environment. Assigning these metagenomic reads to different species or taxonomic classes, referred to as binning of metagenomic data, is a vital step in metagenomic analysis.

RESULTS: In this paper, we propose a new method, TM-MCluster, for binning metagenomic reads. First, we represent each metagenomic read as a set of "k-mers" together with their frequencies of occurrence in the read. Then, we apply a probabilistic topic model, the Latent Dirichlet Allocation (LDA) model, to the reads; it generates a number of hidden "topics" such that each read can be represented by a distribution vector over the generated topics. Finally, as in the MCluster method, we apply SKWIC, a variant of the classical K-means algorithm with an automatic feature weighting mechanism, to cluster the reads represented by their topic distributions.

CONCLUSIONS: Experiments show that the new method TM-MCluster outperforms major existing methods, including AbundanceBin, MetaCluster 3.0/5.0 and MCluster. This result indicates that exploiting topic modeling can effectively improve the binning performance of metagenomic reads.
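The first two steps described in the abstract (k-mer counting, then a topic model over the reads) can be sketched roughly as follows. This is a minimal illustration, not the authors' TM-MCluster code: the function names, the choice of k, the hyperparameters `alpha` and `beta`, and the use of a plain collapsed Gibbs sampler for LDA are all assumptions made here for the example. The final SKWIC clustering step is not shown.

```python
import random
from collections import Counter


def kmer_counts(read, k=4):
    """Count overlapping k-mers in one read, skipping k-mers with non-ACGT bases."""
    read = read.upper()
    counts = Counter()
    for i in range(len(read) - k + 1):
        kmer = read[i:i + k]
        if all(base in "ACGT" for base in kmer):
            counts[kmer] += 1
    return counts


def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iter=100, seed=0):
    """Collapsed Gibbs sampling for LDA (hypothetical helper, not the paper's code).

    docs is a list of token-id lists; returns one topic distribution per doc.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assignments = []
    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_d.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assignments.append(z_d)
    # Resample each token's topic conditioned on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assignments[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    # Smoothed per-document topic proportions; each row sums to 1.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]


def read_topic_vectors(reads, k=4, n_topics=2, **lda_kwargs):
    """Treat each read as a 'document' of k-mer 'words' and return topic vectors."""
    counts = [kmer_counts(r, k) for r in reads]
    vocab = {kmer: i for i, kmer in enumerate(sorted(set().union(*counts)))}
    docs = [[vocab[kmer] for kmer, n in c.items() for _ in range(n)] for c in counts]
    return lda_gibbs(docs, n_topics, len(vocab), **lda_kwargs)


if __name__ == "__main__":
    toy_reads = ["ACGTACGTACGTACGT", "ACGTACGTTACGTACG", "GGCCGGCCGGCCGGCC"]
    for read, theta in zip(toy_reads, read_topic_vectors(toy_reads, k=3, n_topics=2)):
        print(read, [round(p, 3) for p in theta])
```

In a full pipeline, the per-read topic vectors returned by `read_topic_vectors` would be passed to a clustering step. The abstract names SKWIC for that step; any stand-in such as ordinary K-means would be a substitution, not the published method.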

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/451b/4402587/574934a705b1/1471-2105-16-S5-S2-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/451b/4402587/38293366552a/1471-2105-16-S5-S2-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/451b/4402587/ca2ddac19f3b/1471-2105-16-S5-S2-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/451b/4402587/ad3dd9ae3028/1471-2105-16-S5-S2-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/451b/4402587/2b67c19ed722/1471-2105-16-S5-S2-5.jpg

Similar articles

[1]
Exploiting topic modeling to boost metagenomic reads binning.

BMC Bioinformatics. 2015

[2]
A New Unsupervised Binning Approach for Metagenomic Sequences Based on N-grams and Automatic Feature Weighting.

IEEE/ACM Trans Comput Biol Bioinform. 2014

[3]
MetaCluster-TA: taxonomic annotation for metagenomic data based on assembly-assisted binning.

BMC Genomics. 2014-1-24

[4]
Selection of marker genes for genetic barcoding of microorganisms and binning of metagenomic reads by Barcoder software tools.

BMC Bioinformatics. 2018-8-30

[5]
Metagenome Assembly and Contig Assignment.

Methods Mol Biol. 2018

[6]
AFITbin: a metagenomic contig binning method using aggregate l-mer frequency based on initial and terminal nucleotides.

BMC Bioinformatics. 2024-7-16

[7]
MetaProb: accurate metagenomic reads binning based on probabilistic sequence signatures.

Bioinformatics. 2016-9-1

[8]
COCACOLA: binning metagenomic contigs using sequence COmposition, read CoverAge, CO-alignment and paired-end read LinkAge.

Bioinformatics. 2017-3-15

[9]
MetaCluster 4.0: a novel binning algorithm for NGS reads and huge number of species.

J Comput Biol. 2012-2

[10]
Genome-resolved metagenomics using environmental and clinical samples.

Brief Bioinform. 2021-9-2

Cited by

[1]
Decontaminating eukaryotic genome assemblies with machine learning.

BMC Bioinformatics. 2017-12-1

[2]
A new method for enhancer prediction based on deep belief network.

BMC Bioinformatics. 2017-10-16

[3]
MetaTopics: an integration tool to analyze microbial community profile by topic model.

BMC Genomics. 2017-1-25

[4]
An overview of topic modeling and its current applications in bioinformatics.

Springerplus. 2016-9-20

[5]
A novel procedure on next generation sequencing data analysis using text mining algorithm.

BMC Bioinformatics. 2016-5-13

References cited in this article

[1]
A New Unsupervised Binning Approach for Metagenomic Sequences Based on N-grams and Automatic Feature Weighting.

IEEE/ACM Trans Comput Biol Bioinform. 2014

[2]
MetaCluster-TA: taxonomic annotation for metagenomic data based on assembly-assisted binning.

BMC Genomics. 2014-1-24

[3]
MetaCluster 5.0: a two-round binning approach for metagenomic data for low-abundance species in a noisy sample.

Bioinformatics. 2012-9-15

[4]
MetaCluster 4.0: a novel binning algorithm for NGS reads and huge number of species.

J Comput Biol. 2012-2

[5]
Exploiting the functional and taxonomic structure of genomic data by probabilistic topic modeling.

IEEE/ACM Trans Comput Biol Bioinform. 2012

[6]
A robust and accurate binning algorithm for metagenomic sequences with arbitrary species abundance ratio.

Bioinformatics. 2011-4-14

[7]
A novel abundance-based algorithm for binning metagenomic sequences using l-tuples.

J Comput Biol. 2011-3

[8]
MLTreeMap--accurate Maximum Likelihood placement of environmental DNA sequences into taxonomic and functional reference phylogenies.

BMC Genomics. 2010-8-5

[9]
A human gut microbial gene catalogue established by metagenomic sequencing.

Nature. 2010-3-4

[10]
Predicting protein-protein relationships from literature using latent topics.

Genome Inform. 2009-10
