The effect of sequencing errors on metagenomic gene prediction.

Affiliation

Department of Bioinformatics, Institute of Microbiology and Genetics, Georg-August-University Göttingen, Göttingen, Germany.

Publication information

BMC Genomics. 2009 Nov 12;10:520. doi: 10.1186/1471-2164-10-520.

DOI:10.1186/1471-2164-10-520
PMID:19909532
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2781827/
Abstract

BACKGROUND

Gene prediction is an essential step in the annotation of metagenomic sequencing reads. Since most metagenomic reads cannot be assembled into long contigs, specialized statistical gene prediction tools have been developed for short and anonymous DNA fragments, e.g. MetaGeneAnnotator and Orphelia. While conventional gene prediction methods have been subject to a benchmark study on real sequencing reads with typical errors, such a comparison has not yet been conducted for specialized tools. Their gene prediction accuracy was mostly measured on error-free DNA fragments.

RESULTS

In this study, Sanger and pyrosequencing reads were simulated on the basis of models that take all types of sequencing errors into account. All metagenomic gene prediction tools showed decreasing accuracy with increasing sequencing error rates. Performance results on an established metagenomic benchmark dataset are also reported. In addition, we demonstrate that ESTScan, a tool for sequencing error compensation in eukaryotic expressed sequence tags, outperforms some metagenomic gene prediction tools on reads with high error rates although it was not designed for the task at hand.
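
To make concrete why sequencing errors can hurt gene prediction, the following minimal Python sketch injects substitutions, insertions, and deletions into a DNA fragment and checks whether reading frame 0 stays free of stop codons. It is an illustration only, not the error models used in the study: the abstract does not specify them, so the uniform choice of error type, the function names (`introduce_errors`, `codons`, `has_open_frame`), and the toy fragment are all assumptions.

```python
import random

BASES = "ACGT"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def introduce_errors(read, error_rate, rng):
    """Return a copy of `read` with simulated sequencing errors.

    Each position is hit with probability `error_rate`. The error type is
    drawn uniformly, which is an illustrative simplification and not the
    model from the paper.
    """
    out = []
    for base in read:
        if rng.random() >= error_rate:
            out.append(base)  # base read correctly
            continue
        kind = rng.choice(("substitution", "insertion", "deletion"))
        if kind == "substitution":
            out.append(rng.choice([b for b in BASES if b != base]))
        elif kind == "insertion":
            out.append(base)
            out.append(rng.choice(BASES))  # spurious extra base
        # deletion: emit nothing for this position
    return "".join(out)


def codons(seq, frame=0):
    """Split `seq` into complete codons starting at offset `frame`."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def has_open_frame(seq):
    """True if frame 0 of `seq` contains no stop codon."""
    return not any(c in STOP_CODONS for c in codons(seq))


# Hypothetical error-free fragment: ATG GCC TTG ACC GGA (no stop in frame 0).
fragment = "ATGGCCTTGACCGGA"
# Deleting one base (index 4) shifts the frame: ATG GCT TGA CCG ...
with_deletion = fragment[:4] + fragment[5:]

print(has_open_frame(fragment))
print(has_open_frame(with_deletion))

# Random errors at a chosen per-base rate; the seed keeps the run repeatable.
noisy = introduce_errors(fragment, error_rate=0.2, rng=random.Random(0))
print(noisy)
```

A single deleted base is enough to introduce an in-frame stop codon in this toy fragment. Frameshifts of this kind are what error-compensating tools such as ESTScan are designed to tolerate.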

CONCLUSION

This study fills an important gap in metagenomic gene prediction research. Specialized methods are evaluated and compared with respect to sequencing error robustness. Results indicate that the integration of error-compensating methods into metagenomic gene prediction tools would be beneficial to improve metagenome annotation quality.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7f29/2781827/aa632cc4406b/1471-2164-10-520-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7f29/2781827/2229edf13f00/1471-2164-10-520-1.jpg
