Benchmarking BLAST accuracy of genus/phyla classification of metagenomic reads.

Authors

Essinger Steven D, Rosen Gail L

Affiliation

Electrical & Computer Engineering, Drexel University, 3141 Chestnut Street, Philadelphia, PA 19141, USA.

Publication

Pac Symp Biocomput. 2010:10-20. doi: 10.1142/9789814295291_0003.

DOI:10.1142/9789814295291_0003
PMID:19908353
Abstract

Metagenomics is the study of environmental samples. Because few tools exist for metagenomic analysis, a natural step has been to utilize the popular homology tool, BLAST, to search for sequence similarity between sample fragments and an administered database. Most biologists use this method today without knowing BLAST's accuracy, especially when a particular taxonomic class is under-represented in the database. The aim of this paper is to benchmark the performance of BLAST for taxonomic classification of metagenomic datasets in a supervised setting; meaning that the database contains microbes of the same class as the 'unknown' query fragments. We examine well- and under-represented genera and phyla in order to study their effect on the accuracy of BLAST. We conclude that on fine-resolution classes, such as genera, the accuracy of BLAST does not degrade very much with under-representation, but in a highly variant class, such as phyla, performance degrades significantly. Our analysis includes five-fold cross validation to substantiate our findings.
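The five-fold cross-validation mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: `five_fold_accuracy`, `best_hit_classify`, and the shared k-mer score are hypothetical stand-ins, and the k-mer overlap only imitates a "best hit" search. A real benchmark would build a BLAST database from the training folds and take the top hit for each query fragment.

```python
import random
from collections import defaultdict


def kmers(seq, n=4):
    """Return the set of length-n substrings of seq."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


def best_hit_classify(train, query, n=4):
    """Hypothetical stand-in for a BLAST top hit: label of the training
    sequence that shares the most k-mers with the query."""
    q = kmers(query, n)
    best_label, best_score = None, -1
    for seq, label in train:
        score = len(q & kmers(seq, n))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def five_fold_accuracy(samples, classify, k=5, seed=0):
    """Per-class accuracy from k-fold cross-validation.

    samples  : list of (sequence, label) pairs
    classify : function (train, query) -> predicted label
    """
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    # Deal the shuffled samples into k folds of near-equal size.
    folds = [shuffled[i::k] for i in range(k)]

    correct = defaultdict(int)
    total = defaultdict(int)
    for i in range(k):
        test = folds[i]
        # The "database" is every fold except the held-out one.
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        for seq, label in test:
            total[label] += 1
            if classify(train, seq) == label:
                correct[label] += 1
    return {label: correct[label] / total[label] for label in total}
```

Reporting accuracy per class rather than as one pooled number makes it possible to compare well-represented and under-represented classes, which is the comparison the abstract describes.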

Similar articles

1. Benchmarking BLAST accuracy of genus/phyla classification of metagenomic reads.
Pac Symp Biocomput. 2010:10-20. doi: 10.1142/9789814295291_0003.
2. SEPP: SATé-enabled phylogenetic placement.
Pac Symp Biocomput. 2012:247-58. doi: 10.1142/9789814366496_0024.
3. Benchmarking Metagenomics Tools for Taxonomic Classification.
Cell. 2019 Aug 8;178(4):779-794. doi: 10.1016/j.cell.2019.07.010.
4. Accurate taxonomic assignment of short pyrosequencing reads.
Pac Symp Biocomput. 2010:3-9. doi: 10.1142/9789814295291_0002.
5. Selection of marker genes for genetic barcoding of microorganisms and binning of metagenomic reads by Barcoder software tools.
BMC Bioinformatics. 2018 Aug 30;19(1):309. doi: 10.1186/s12859-018-2320-1.
6. GRASPx: efficient homolog-search of short peptide metagenome database through simultaneous alignment and assembly.
BMC Bioinformatics. 2016 Aug 31;17 Suppl 8(Suppl 8):283. doi: 10.1186/s12859-016-1119-1.
7. Resolving prokaryotic taxonomy without rRNA: longer oligonucleotide word lengths improve genome and metagenome taxonomic classification.
PLoS One. 2013 Jul 1;8(7):e67337. doi: 10.1371/journal.pone.0067337. Print 2013.
8. CLAST: CUDA implemented large-scale alignment search tool.
BMC Bioinformatics. 2014 Dec 11;15(1):406. doi: 10.1186/s12859-014-0406-y.
9. Comparative study of sequence aligners for detecting antibiotic resistance in bacterial metagenomes.
Lett Appl Microbiol. 2018 Mar;66(3):162-168. doi: 10.1111/lam.12842. Epub 2018 Feb 1.
10. Comparison of statistical methods to classify environmental genomic fragments.
IEEE Trans Nanobioscience. 2010 Dec;9(4):310-6. doi: 10.1109/TNB.2010.2081375. Epub 2010 Sep 27.

Cited by

1. Discovering the unknown: improving detection of novel species and genera from short reads.
J Biomed Biotechnol. 2011;2011:495849. doi: 10.1155/2011/495849. Epub 2011 Mar 23.