Impact of sequencing depth and technology on de novo RNA-Seq assembly.

Affiliations

Department of Medicine, University of Alberta, Edmonton, AB, T6G 2E1, Canada.

Department of Biological Sciences, University of Alberta, Edmonton, AB, T6G 2E9, Canada.

Publication information

BMC Genomics. 2019 Jul 23;20(1):604. doi: 10.1186/s12864-019-5965-x.

DOI:10.1186/s12864-019-5965-x
PMID:31337347
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6651908/
Abstract

BACKGROUND

RNA-Seq data is inherently nonuniform across transcripts because of differences in gene expression. This makes it challenging to decide how much data should be generated from each sample. How much should one spend to recover the less expressed transcripts? The sequencing technology used is another consideration, as there are inevitably biases against certain sequences. To investigate these effects, we first looked at high-depth libraries from a set of well-annotated organisms to ascertain the impact of sequencing depth on de novo assembly. We then looked at libraries sequenced from the Universal Human Reference RNA (UHRR) to compare the performance of Illumina HiSeq and MGI DNBseq™ technologies.

RESULTS

On the issue of sequencing depth, the amount of exomic sequence assembled plateaued using data sets of approximately 2 to 8 Gbp. However, the amount of genomic sequence assembled did not plateau for many of the analyzed organisms. Most of the unannotated genomic sequences are single-exon transcripts whose biological significance will be questionable for some users. On the issue of sequencing technology, both of the analyzed platforms recovered a similar number of full-length transcripts. The missing "gap" regions in the HiSeq assemblies were often attributed to higher GC contents, but this may be an artefact of library preparation and not of sequencing technology.

CONCLUSIONS

Increasing sequencing depth beyond modest data sets of less than 10 Gbp recovers a plethora of single-exon transcripts undocumented in genome annotations. DNBseq™ is a viable alternative to HiSeq for de novo RNA-Seq assembly.
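
To make the depth figures in the abstract concrete, the sketch below converts a target data set size in Gbp into a number of read pairs, and computes the GC fraction of a sequence, which is the property the abstract associates with the HiSeq "gap" regions. The read length (100 bp, paired-end) and both function names are assumptions chosen for illustration; the abstract does not state the read configuration, and this is not the authors' pipeline.

```python
def read_pairs_for_depth(target_gbp: float, read_length: int, paired: bool = True) -> int:
    """Return how many reads (or read pairs) yield `target_gbp` of raw sequence.

    Assumption: every read has exactly `read_length` bases (no trimming).
    """
    bases_per_fragment = read_length * (2 if paired else 1)
    return round(target_gbp * 1e9 / bases_per_fragment)


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    unambiguous = sum(seq.count(base) for base in "ACGT")
    if unambiguous == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / unambiguous


if __name__ == "__main__":
    # Hypothetical 2 x 100 bp paired-end run, at the two ends of the plateau range.
    for gbp in (2, 8):
        print(gbp, "Gbp ->", read_pairs_for_depth(gbp, 100), "read pairs")
    print("GC fraction:", gc_fraction("GGCCAATT"))
```

Under the assumed 2 x 100 bp configuration, the 2 to 8 Gbp range at which exomic sequence plateaued corresponds to roughly 10 to 40 million read pairs; a different read length changes these counts proportionally.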

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d367/6651908/81a08269258b/12864_2019_5965_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d367/6651908/c59e09eba886/12864_2019_5965_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d367/6651908/aff2d8c6ee07/12864_2019_5965_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d367/6651908/7307f8d28b63/12864_2019_5965_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d367/6651908/e358e80649d9/12864_2019_5965_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d367/6651908/0969fce65e0a/12864_2019_5965_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d367/6651908/18d43272e5c7/12864_2019_5965_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d367/6651908/95959fc3067e/12864_2019_5965_Fig8_HTML.jpg
