Occupancy modeling, maximum contig size probabilities and designing metagenomics experiments.

Affiliation

Biological Sciences Division, University of Chicago, Chicago, Illinois, United States of America.

Publication information

PLoS One. 2010 Jul 29;5(7):e11652. doi: 10.1371/journal.pone.0011652.

DOI:10.1371/journal.pone.0011652
PMID:20686599
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2912229/
Abstract

Mathematical aspects of coverage and gaps in genome assembly have received substantial attention from bioinformaticians. Typical problems under consideration suppose that reads can be experimentally obtained from a single genome and that the number of reads will be set to cover a large percentage of that genome at a desired depth. In metagenomics experiments, genomes from multiple species are analyzed simultaneously, and obtaining large numbers of reads per genome is unlikely. As a metric for metagenomic experimental design, we propose the probability of obtaining at least one contig of a desired minimum size from each novel genome in the pool, without restriction based on depth of coverage. We derive an approximation to the distribution of maximum contig size for single genome assemblies using relatively few reads. This approximation is verified in simulation studies and applied to a number of different metagenomic experimental design problems. These range in difficulty from detecting a single novel genome in a pool of known species to detecting, in a single pool, each of a random number of novel genomes whose collective size and abundances correspond to given distributions.
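The design metric described in the abstract, the probability that a genome yields at least one contig of a desired minimum size, can be illustrated with a small Monte Carlo sketch. This is not the authors' analytical approximation. It is a brute-force simulation under simplifying assumptions that are not taken from the paper: a linear genome, fixed-length reads with uniformly random start positions, no sequencing errors, and a contig formed whenever consecutive reads overlap by at least `min_overlap` bases. The function names and parameters are hypothetical.

```python
import random


def max_contig_length(genome_len, read_len, n_reads, rng, min_overlap=1):
    """Longest contig (in bases) from n_reads reads placed uniformly at random.

    Assumption: reads have a fixed length and two reads join into the same
    contig when they overlap by at least min_overlap bases.
    """
    if n_reads == 0:
        return 0
    # Sample start positions so that every read fits entirely in the genome.
    starts = sorted(
        rng.randrange(genome_len - read_len + 1) for _ in range(n_reads)
    )
    best = 0
    contig_start = starts[0]
    contig_end = starts[0] + read_len
    for s in starts[1:]:
        if contig_end - s >= min_overlap:
            # The read overlaps the current contig: extend it.
            contig_end = max(contig_end, s + read_len)
        else:
            # Gap (or too little overlap): close the contig and start a new one.
            best = max(best, contig_end - contig_start)
            contig_start, contig_end = s, s + read_len
    return max(best, contig_end - contig_start)


def prob_contig_at_least(min_size, genome_len, read_len, n_reads,
                         trials=2000, seed=0):
    """Monte Carlo estimate of P(max contig length >= min_size)."""
    rng = random.Random(seed)
    hits = sum(
        max_contig_length(genome_len, read_len, n_reads, rng) >= min_size
        for _ in range(trials)
    )
    return hits / trials
```

For example, `prob_contig_at_least(500, 10_000, 100, 400)` estimates the chance that 400 reads of 100 bases from a 10 kb genome produce a contig of at least 500 bases. Repeating the call over a range of `n_reads` values gives a rough picture of how many reads a single genome would need under these assumptions.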

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4d6d/2912229/eefda33384db/pone.0011652.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4d6d/2912229/6353d3b76a72/pone.0011652.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4d6d/2912229/efdb335b799c/pone.0011652.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4d6d/2912229/1f36b507c24d/pone.0011652.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4d6d/2912229/fdc931cda81a/pone.0011652.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4d6d/2912229/cd36ebad9170/pone.0011652.g006.jpg
