A general coverage theory for shotgun DNA sequencing.

Author information

Wendl Michael C

Affiliation

Genome Sequencing Center, Washington University, St. Louis, Missouri 63108, USA.

Publication information

J Comput Biol. 2006 Jul-Aug;13(6):1177-96. doi: 10.1089/cmb.2006.13.1177.

DOI:10.1089/cmb.2006.13.1177
PMID:16901236
Abstract

The classical theory of shotgun DNA sequencing accounts for neither the placement dependencies that are a fundamental consequence of the forward-reverse sequencing strategy, nor the edge effect that arises for small to moderate-sized genomic targets. These phenomena are relevant to a number of sequencing scenarios, including large-insert BAC and fosmid clones, filtered genomic libraries, and macro-nuclear chromosomes. Here, we report a model that considers these two effects and provides both the expected value of coverage and its variance. Comparison to methyl-filtered maize data shows significant improvement over classical theory. The model is used to analyze coverage performance over a range of small to moderately sized genomic targets. We find that the read pairing effect and the edge effect interact in a non-trivial fashion. Shorter reads give superior coverage per unit sequence depth relative to longer ones. In principle, end-sequences can be optimized with respect to template insert length; however, optimal performance is unlikely to be realized in most cases because of inherent size variation in any set of targets. Conversely, single-stranded reads exhibit roughly the same coverage attributes as optimized end-reads. Although linking information is lost, single-stranded data should not pose a significant assembly liability if the target represents predominantly low-copy sequence. We also find that random sequencing should be halted at substantially lower redundancies than those now associated with larger projects. Given the enormous amount of data generated per cycle on pyrosequencing instruments, this observation suggests devising schemes to split each run cycle between two or more projects. This would prevent over-sequencing and would further leverage the pyrosequencing method.
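
The abstract does not reproduce the paper's equations, so the following Python sketch is only an illustration of the two ideas it contrasts, not the author's model. It assumes the familiar classical approximation that the expected covered fraction at redundancy c is 1 - e^(-c), and it sets this beside a toy edge-effect model in which each read must lie entirely inside a linear target and start positions are uniform. The function names and the toy model are assumptions introduced here, and read pairing is not modeled.

```python
import math


def classical_coverage(redundancy):
    """Expected covered fraction under the classical approximation 1 - e^(-c)."""
    return 1.0 - math.exp(-redundancy)


def position_coverage(target_len, read_len, n_reads):
    """Per-position coverage probability in a toy edge-effect model.

    Assumption (not taken from the paper): each read lies entirely inside a
    linear target, and its start is uniform over the valid start positions.
    """
    if not 0 < read_len <= target_len:
        raise ValueError("read_len must be in 1..target_len")
    n_starts = target_len - read_len + 1
    probs = []
    for x in range(target_len):
        # Valid read starts that cover position x.
        first = max(0, x - read_len + 1)
        last = min(x, target_len - read_len)
        window = last - first + 1
        # Probability that at least one of n_reads independent reads covers x.
        probs.append(1.0 - (1.0 - window / n_starts) ** n_reads)
    return probs


def toy_expected_coverage(target_len, read_len, n_reads):
    """Mean of the per-position probabilities, i.e. expected covered fraction."""
    probs = position_coverage(target_len, read_len, n_reads)
    return sum(probs) / target_len


if __name__ == "__main__":
    target_len, read_len = 5000, 500
    for redundancy in (1, 2, 4):
        n_reads = redundancy * target_len // read_len
        probs = position_coverage(target_len, read_len, n_reads)
        print(
            redundancy,
            classical_coverage(redundancy),
            toy_expected_coverage(target_len, read_len, n_reads),
            probs[0],
            probs[target_len // 2],
        )
```

In this toy model the first and last bases of the target can be covered by only one read placement each, so their coverage probability is far lower than that of interior bases. This is the kind of position dependence that a single redundancy-based formula cannot express.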


