Identification of optimum sequencing depth especially for de novo genome assembly of small genomes using next generation sequencing data.

Affiliation

Persistent LABS, Persistent Systems Ltd., Pune, Maharashtra, India.

Publication information

PLoS One. 2013 Apr 12;8(4):e60204. doi: 10.1371/journal.pone.0060204. Print 2013.

DOI:10.1371/journal.pone.0060204
PMID:23593174
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3625192/
Abstract

Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing have encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads, and de novo genome assembly from these short reads is computationally intensive. Owing to the lower cost of sequencing and higher throughput, NGS systems can now sequence genomes at high depth. However, no report has yet examined the impact of high sequencing depth on genome assembly using real data sets and multiple assembly algorithms. Some recent studies have evaluated the impact of sequence coverage, error rate, and average read length on genome assembly using multiple assembly algorithms; however, these evaluations were performed on simulated datasets. One limitation of simulated datasets is that variables known to affect genome assembly, such as error rate, read length, and coverage, are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly of genomes of different sizes using graph-based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 Mb), S. kudriavzevii (11.18 Mb), and C. elegans (100 Mb) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous, and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes with all assemblers except Meraculous, which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6-40 GB of RAM, depending on the genome size and the assembly algorithm used. We believe that this information can be extremely valuable to researchers in designing experiments and multiplexing, enabling optimum utilization of sequencing and analysis resources.
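To give a feel for what the 50X and 100X figures mean in terms of raw data, the sketch below applies the standard average-coverage relation, depth = (number of reads × read length) / genome size, to the three genome sizes quoted in the abstract. The 100 bp read length and the function names are assumptions made for illustration only; the abstract does not state the read lengths used, and this is not the authors' code.

```python
import math

# Genome sizes as quoted in the abstract, converted to base pairs.
GENOME_SIZES_BP = {
    "E. coli": 4_600_000,
    "S. kudriavzevii": 11_180_000,
    "C. elegans": 100_000_000,
}


def reads_for_depth(genome_size_bp: int, depth: float, read_length_bp: int) -> int:
    """Smallest number of reads whose total bases reach the target average depth."""
    if genome_size_bp <= 0 or read_length_bp <= 0 or depth < 0:
        raise ValueError("genome size and read length must be positive, depth non-negative")
    return math.ceil(depth * genome_size_bp / read_length_bp)


def depth_from_reads(n_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average depth of coverage obtained from n_reads reads of a fixed length."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return n_reads * read_length_bp / genome_size_bp


if __name__ == "__main__":
    READ_LENGTH_BP = 100  # hypothetical read length, not taken from the abstract
    for name, size in GENOME_SIZES_BP.items():
        for depth in (50, 100):
            n = reads_for_depth(size, depth, READ_LENGTH_BP)
            print(f"{name}: {depth}X needs about {n:,} reads of {READ_LENGTH_BP} bp")
```

Under these assumptions the read count scales linearly with both genome size and target depth, which is why the choice between 50X and 100X matters when deciding how many samples to multiplex on a single run.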

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/553f/3625192/2b3321236cab/pone.0060204.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/553f/3625192/812154b0338e/pone.0060204.g001.jpg
