
RefSeq database growth influences the accuracy of k-mer-based lowest common ancestor species identification.

Affiliations

Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD, USA.

Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, Bethesda, MD, USA.

Publication information

Genome Biol. 2018 Oct 30;19(1):165. doi: 10.1186/s13059-018-1554-6.


DOI:10.1186/s13059-018-1554-6
PMID:30373669
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6206640/
Abstract

In order to determine the role of the database in taxonomic sequence classification, we examine the influence of the database over time on k-mer-based lowest common ancestor taxonomic classification. We present three major findings: the number of new species added to the NCBI RefSeq database greatly outpaces the number of new genera; as a result, more reads are classified with newer database versions, but fewer are classified at the species level; and Bayesian-based re-estimation mitigates this effect but struggles with novel genomes. These results suggest a need for new classification approaches specially adapted for large databases.
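
The second finding can be illustrated with a toy example. The sketch below is not the authors' pipeline or any real classifier's implementation: the taxonomy, the sequences, the value of k, and the simple majority vote in `classify` are all assumptions made for illustration. It only shows the general mechanism: when a database gains a second species in the same genus, k-mers shared by both species are relabeled with their lowest common ancestor (the genus), so a read that was once assigned to a species is now assigned to the genus, while a read from the newly added species becomes classifiable.

```python
from collections import Counter

# Hypothetical toy taxonomy: child -> parent (names are illustrative only).
PARENT = {
    "species_A": "genus_X",
    "species_B": "genus_X",
    "genus_X": "root",
    "root": None,
}


def lineage(taxon):
    """Return the path from a taxon up to the root, inclusive."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = PARENT[taxon]
    return path


def lca(a, b):
    """Lowest common ancestor of two taxa in the toy tree."""
    ancestors_of_a = set(lineage(a))
    for taxon in lineage(b):
        if taxon in ancestors_of_a:
            return taxon
    raise ValueError("taxa share no ancestor")


def kmers(seq, k):
    """All overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_database(genomes, k):
    """Map each k-mer to the LCA of all taxa whose genome contains it."""
    db = {}
    for taxon, seq in genomes.items():
        for kmer in set(kmers(seq, k)):
            db[kmer] = lca(db[kmer], taxon) if kmer in db else taxon
    return db


def classify(read, db, k):
    """Assign the read to the taxon hit by the most k-mers (simplified vote)."""
    hits = Counter(db[kmer] for kmer in kmers(read, k) if kmer in db)
    return hits.most_common(1)[0][0] if hits else "unclassified"


K = 4
# Made-up sequences: the two species share their first eight bases.
old_db = build_database({"species_A": "ACGTACGGTCA"}, K)
new_db = build_database(
    {"species_A": "ACGTACGGTCA", "species_B": "ACGTACGGAAT"}, K
)

shared_read = "ACGTACGG"   # region present in both species
b_only_read = "CGGAAT"     # region present only in species_B

print(classify(shared_read, old_db, K))  # species_A
print(classify(shared_read, new_db, K))  # genus_X
print(classify(b_only_read, old_db, K))  # unclassified
print(classify(b_only_read, new_db, K))  # species_B
```

With the larger database, one more read is classified overall, but the read from the shared region loses its species-level label. Real classifiers score hits along root-to-leaf paths rather than using a plain majority vote, so this is only a simplified picture of the effect.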


Figures

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/03fc/6206640/e1dd687ea002/13059_2018_1554_Fig1_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/03fc/6206640/f1755d6e58f6/13059_2018_1554_Fig3_HTML.jpg
Fig. 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/03fc/6206640/2f215c542f4a/13059_2018_1554_Fig5_HTML.jpg



