AFITbin: a metagenomic contig binning method using aggregate l-mer frequency based on initial and terminal nucleotides.

Affiliations

Department of Computer and Data Sciences, Faculty of Mathematical Sciences, Shahid Beheshti University, Tehran, Iran.

School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran.

Publication information

BMC Bioinformatics. 2024 Jul 16;25(1):241. doi: 10.1186/s12859-024-05859-7.


DOI:10.1186/s12859-024-05859-7
PMID:39014300
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11253361/
Abstract

BACKGROUND: Using next-generation sequencing technologies, scientists can sequence complex microbial communities directly from the environment. Metagenomic studies have yielded significant insights into the structure, diversity, and ecology of microbial communities. A crucial step in metagenomic analysis is the assembly of reads into longer contigs, which are then binned into groups of contigs that correspond to different species in the sample. These contigs must be organized into operational taxonomic units (OTUs) for further taxonomic profiling and functional analysis. For binning, which is synonymous with the clustering of OTUs, the tetra-nucleotide frequency (TNF) is typically used as a compositional feature for each OTU.

RESULTS: In this paper, we present AFIT, a new l-mer statistic vector for each contig, and AFITBin, a novel method for metagenomic binning based on AFIT and a matrix factorization method. To evaluate the performance of the AFIT vector, the t-SNE algorithm is used to compare species clustering based on AFIT and TNF information. In addition, the efficacy of AFITBin is demonstrated on both simulated and real datasets in comparison to state-of-the-art binning methods such as MetaBAT 2, MaxBin 2.0, CONCOCT, MetaCon, SolidBin, BusyBee Web, and MetaBinner. To further analyze the performance of the proposed AFIT vector, we compare the barcodes of the AFIT vector and the TNF vector.

CONCLUSION: The results demonstrate that AFITBin shows superior performance in taxonomic identification compared to existing methods, leveraging the AFIT vector for improved results in metagenomic binning. This approach holds promise for advancing the analysis of metagenomic data, providing more reliable insights into microbial community composition and function.

AVAILABILITY: A Python package is available at: https://github.com/SayehSobhani/AFITBin
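The abstract names tetra-nucleotide frequency (TNF) as the usual compositional feature for binning. As a rough illustration of that baseline only, the sketch below counts overlapping 4-mers in a single contig and normalizes them to frequencies. It is not the AFIT vector, whose definition the abstract does not give, and it is not taken from the AFITBin package; the function name `tetranucleotide_frequency` and the example sequence are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def tetranucleotide_frequency(contig: str, k: int = 4) -> dict[str, float]:
    """Return normalized k-mer frequencies for one contig (k=4 gives TNF).

    Hypothetical helper for illustration; not part of AFITBin.
    """
    seq = contig.upper()
    alphabet = set("ACGT")
    # Fixed ordering of all 4**k possible k-mers, so every contig
    # maps to a vector with the same dimensions.
    all_kmers = ["".join(p) for p in product("ACGT", repeat=k)]

    counts: Counter[str] = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Skip windows containing ambiguous bases such as N.
        if set(kmer) <= alphabet:
            counts[kmer] += 1

    total = sum(counts.values())
    return {km: (counts[km] / total if total else 0.0) for km in all_kmers}


# Example contig (made up for this sketch).
tnf = tetranucleotide_frequency("ACGTACGTAC")
print(len(tnf), round(tnf["ACGT"], 4))
```

This sketch treats each strand-specific 4-mer separately, giving 256 dimensions; some binning tools instead merge each k-mer with its reverse complement, which this example does not attempt.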


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6b3/11253361/1845b0681775/12859_2024_5859_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6b3/11253361/d142518a2b7a/12859_2024_5859_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6b3/11253361/70c40ae32ca6/12859_2024_5859_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6b3/11253361/bd127efb3f8c/12859_2024_5859_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6b3/11253361/803dd118b298/12859_2024_5859_Fig5_HTML.jpg

References cited in this article

[1]
NN-RNALoc: Neural network-based model for prediction of mRNA sub-cellular localization using distance-based sub-sequence profiles.

PLoS One. 2023

[2]
MetaBinner: a high-performance and stand-alone ensemble binning method to recover individual genomes from complex microbial communities.

Genome Biol. 2023-1-6

[3]
16S-FASAS: an integrated pipeline for synthetic full-length 16S rRNA gene sequencing data analysis.

PeerJ. 2022

[4]
A t-SNE Based Classification Approach to Compositional Microbiome Data.

Front Genet. 2020-12-14

[5]
GraphBin: refined binning of metagenomic contigs using assembly graphs.

Bioinformatics. 2020-6-1

[6]
MetaCon: unsupervised clustering of metagenomic contigs with probabilistic k-mers statistics and coverage.

BMC Bioinformatics. 2019-11-22

[7]
MetaBAT 2: an adaptive binning algorithm for robust and efficient genome reconstruction from metagenome assemblies.

PeerJ. 2019-7-26

[8]
SolidBin: improving metagenome binning with semi-supervised normalized cut.

Bioinformatics. 2019-11-1

[9]
MetaWRAP-a flexible pipeline for genome-resolved metagenomic data analysis.

Microbiome. 2018-9-15

[10]
Critical Assessment of Metagenome Interpretation-a benchmark of metagenomics software.

Nat Methods. 2017-11
