Coverage theories for metagenomic DNA sequencing based on a generalization of Stevens' theorem.

Authors

Wendl Michael C, Kota Karthik, Weinstock George M, Mitreva Makedonka

Affiliation

The Genome Institute, Washington University, St. Louis, MO 63108, USA

Publication

J Math Biol. 2013 Nov;67(5):1141-61. doi: 10.1007/s00285-012-0586-x. Epub 2012 Sep 11.

DOI:10.1007/s00285-012-0586-x
PMID:22965653
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3795925/
Abstract

Metagenomic project design has relied variously upon speculation, semi-empirical and ad hoc heuristic models, and elementary extensions of single-sample Lander-Waterman expectation theory, all of which are demonstrably inadequate. Here, we propose an approach based upon a generalization of Stevens' Theorem for randomly covering a domain. We extend this result to account for the presence of multiple species, from which are derived useful probabilities for fully recovering a particular target microbe of interest and for average contig length. These show improved specificities compared to older measures and recommend deeper data generation than the levels chosen by some early studies, supporting the view that poor assemblies were due at least somewhat to insufficient data. We assess predictions empirically by generating roughly 4.5 Gb of sequence from a twelve member bacterial community, comparing coverage for two particular members, Selenomonas artemidis and Enterococcus faecium, which are the least (approximately 3%) and most (approximately 12%) abundant species, respectively. Agreement is reasonable, with differences likely attributable to coverage biases. We show that, in some cases, bias is simple in the sense that a small reduction in read length to simulate less efficient covering brings data and theory into essentially complete accord. Finally, we describe two applications of the theory. One plots coverage probability over the relevant parameter space, constructing essentially a "metagenomic design map" to enable straightforward analysis and design of future projects. The other gives an overview of the data requirements for various types of sequencing milestones, including a desired number of contact reads and contig length, for detection of a rare viral species.
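As a rough illustration of the kind of calculation the abstract refers to, the sketch below evaluates the classical Stevens' theorem: the probability that n arcs, each covering a fraction a of a circle and placed uniformly at random, cover the circle completely, P = sum over k from 0 to floor(1/a) of (-1)^k C(n, k) (1 - k a)^(n-1). The second function is not taken from the paper. It is a simplified, assumed way to bring in a multi-species community: each read is taken to land on the target genome independently with probability `read_share` (a hypothetical parameter standing for the target's share of reads), and the Stevens probability is averaged over the resulting binomial read count. All function and parameter names are illustrative, and the genome is assumed to be circular.

```python
from fractions import Fraction
from math import comb, floor


def stevens_full_coverage(n, a):
    """Classical Stevens' theorem: probability that n random arcs, each
    spanning a fraction `a` of a circle, cover the whole circle."""
    a = Fraction(a)
    if n <= 0 or a <= 0:
        return Fraction(0)
    if a >= 1:
        return Fraction(1)  # a single arc already spans the circle
    k_max = min(n, floor(1 / a))
    return sum(
        (-1) ** k * comb(n, k) * (1 - k * a) ** (n - 1)
        for k in range(k_max + 1)
    )


def target_full_coverage(total_reads, read_share, read_len, genome_len):
    """Assumed multi-species sketch (not the paper's derivation): the number
    of reads hitting the target is Binomial(total_reads, read_share), and the
    Stevens probability is averaged over that distribution."""
    a = Fraction(read_len, genome_len)  # read length as a fraction of the genome
    p = Fraction(read_share)
    return sum(
        comb(total_reads, n) * p ** n * (1 - p) ** (total_reads - n)
        * stevens_full_coverage(n, a)
        for n in range(total_reads + 1)
    )


if __name__ == "__main__":
    # Three arcs of half the circumference each cover the circle with probability 1/4.
    print(stevens_full_coverage(3, Fraction(1, 2)))
    # Toy community: 3 reads, half of them expected from the target genome.
    print(target_full_coverage(3, Fraction(1, 2), 3, 4))
```

Exact rational arithmetic is used because the alternating sum loses precision quickly in floating point. That choice keeps toy cases exact but does not scale to realistic read counts, where a different numerical treatment would be needed.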

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dd54/3824951/29f2b26c9507/285_2012_586_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dd54/3824951/ed26e5de3c75/285_2012_586_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dd54/3824951/a99921ece227/285_2012_586_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dd54/3824951/9ed4f0bc24a4/285_2012_586_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dd54/3824951/12c866ca9e3e/285_2012_586_Fig4_HTML.jpg
