Designing deep sequencing experiments: detecting structural variation and estimating transcript abundance.

Affiliation

Dept. of Computer Science and Engineering, UC San Diego, La Jolla, CA, USA.

Publication information

BMC Genomics. 2010 Jun 18;11:385. doi: 10.1186/1471-2164-11-385.

DOI: 10.1186/1471-2164-11-385
PMID: 20565853
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3091630/
Abstract

BACKGROUND

Massively parallel DNA sequencing technologies have enabled the sequencing of several individual human genomes. These technologies are also being used in novel ways for mRNA expression profiling, genome-wide discovery of transcription-factor binding sites, small RNA discovery, etc. The multitude of sequencing platforms, each with their unique characteristics, pose a number of design challenges, regarding the technology to be used and the depth of sequencing required for a particular sequencing application. Here we describe a number of analytical and empirical results to address design questions for two applications: detection of structural variations from paired-end sequencing and estimating mRNA transcript abundance.

RESULTS

For structural variation, our results provide explicit trade-offs between the detection and resolution of rearrangement breakpoints, and the optimal mix of paired-read insert lengths. Specifically, we prove that optimal detection and resolution of breakpoints is achieved using a mix of exactly two insert library lengths. Furthermore, we derive explicit formulae to determine these insert length combinations, enabling a 15% improvement in breakpoint detection at the same experimental cost. On empirical short read data, these predictions show good concordance with Illumina 200 bp and 2 Kbp insert length libraries. For transcriptome sequencing, we determine the sequencing depth needed to detect rare transcripts from a small pilot study. With only 1 million reads, we derive corrections that enable almost perfect prediction of the underlying expression probability distribution, and use this to predict the sequencing depth required to detect lowly expressed genes with greater than 95% probability.
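The link between sequencing depth and the chance of detecting a rare transcript can be illustrated with a minimal sketch. This is not the paper's corrected estimator. It assumes a deliberately simple model in which each read is drawn independently and hits a given transcript with a known probability `rel_abundance`, so that detection means observing at least one read. The function names and parameters are hypothetical, chosen only for illustration.

```python
import math


def detection_probability(rel_abundance: float, n_reads: int) -> float:
    """P(at least one read from the transcript) under independent sampling.

    Computes 1 - (1 - p)^N using log1p/expm1 for accuracy when p is tiny.
    """
    return -math.expm1(n_reads * math.log1p(-rel_abundance))


def reads_for_detection(rel_abundance: float, target_prob: float = 0.95) -> int:
    """Smallest N with 1 - (1 - p)^N >= target_prob (simplified model)."""
    if not 0.0 < rel_abundance < 1.0:
        raise ValueError("rel_abundance must lie strictly between 0 and 1")
    if not 0.0 < target_prob < 1.0:
        raise ValueError("target_prob must lie strictly between 0 and 1")
    return math.ceil(math.log1p(-target_prob) / math.log1p(-rel_abundance))


if __name__ == "__main__":
    # Hypothetical transcript accounting for one read in a million.
    p = 1e-6
    n = reads_for_detection(p, 0.95)
    print(f"reads needed: {n}")
    print(f"detection probability at that depth: {detection_probability(p, n):.4f}")
```

Under this simplified model, a transcript at one part per million needs on the order of 3 million reads for 95% detection. In practice the abundance is unknown and must be estimated from pilot data, which is the problem the abstract describes.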

CONCLUSIONS

Together, our results form a generic framework for many design considerations related to high-throughput sequencing. We provide software tools (http://bix.ucsd.edu/projects/NGS-DesignTools) to derive platform-independent guidelines for designing sequencing experiments (amount of sequencing, choice of insert length, mix of libraries) for novel applications of next-generation sequencing.
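As a rough illustration of why insert length matters for breakpoint detection, the sketch below uses a standard Poisson coverage approximation. It is not the paper's formulae. It assumes clones of a fixed insert length placed uniformly at random on the genome, and it counts a breakpoint as detected when at least one clone spans it. The function name and the example numbers are hypothetical.

```python
import math


def span_probability(n_clones: int, insert_len: int, genome_len: int) -> float:
    """P(at least one clone spans a fixed breakpoint).

    Poisson approximation: the number of spanning clones has mean
    n_clones * insert_len / genome_len, so P = 1 - exp(-mean).
    """
    mean_spanning = n_clones * insert_len / genome_len
    return -math.expm1(-mean_spanning)


if __name__ == "__main__":
    genome_len = 3_000_000_000  # roughly human-sized genome (assumed)
    n_clones = 1_000_000        # hypothetical number of paired reads
    for insert_len in (200, 2_000):
        prob = span_probability(n_clones, insert_len, genome_len)
        print(f"insert {insert_len} bp: span probability {prob:.3f}")
```

With the number of clones held fixed, the longer insert spans a given breakpoint more often, but a single spanning clone localizes the breakpoint only to within about one insert length. That tension between detection and resolution is what motivates mixing libraries.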
