The effects of sequencing depth on the assembly of coding and noncoding transcripts in the human genome.

Affiliation

Shenzhen Key Laboratory of Gene Regulation and Systems Biology, Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, 518055, China.

Publication information

BMC Genomics. 2022 Jul 4;23(1):487. doi: 10.1186/s12864-022-08717-z.

DOI:10.1186/s12864-022-08717-z
PMID:35787153
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9251931/
Abstract

Investigating the functions and activities of genes requires proper annotation of the transcribed units. However, transcript assembly efforts have produced surprisingly large variation in the number of transcripts, especially for noncoding transcripts. This heterogeneity in assembled transcript sets might be partially explained by sequencing depth. Here, we used real and simulated short-read sequencing data, as well as long-read data, to systematically investigate the impact of sequencing depth on the accuracy of assembled transcripts. We assembled and analyzed transcripts from 671 human short-read data sets and four long-read data sets. At the first level, there is a positive correlation between the number of reads and the number of recovered transcripts. However, the effect of sequencing depth varied with the cell or tissue type, the type of read, and the nature and expression levels of the transcripts. The detection of coding transcripts saturated rapidly with both short and long reads; however, there was no sign of early saturation for noncoding transcripts at any sequencing depth. Increasing long-read sequencing depth specifically benefited transcripts containing transposable elements. Finally, we show how single-cell RNA-seq can be guided by transcripts assembled from bulk long-read samples, and demonstrate that noncoding transcripts are expressed at similar levels to coding transcripts but are expressed in fewer cells. This study highlights the impact of sequencing depth on transcript assembly.
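
The saturation behaviour described in the abstract can be illustrated with a toy subsampling experiment. The sketch below is not the authors' pipeline: it assumes each read has already been assigned to a transcript, and the names `saturation_curve`, `transcript_class`, and `min_reads`, as well as the simulated expression levels, are hypothetical choices made only for illustration. Reads are shuffled once and nested prefixes are taken, so each deeper sample contains every shallower one.

```python
import random
from collections import Counter


def saturation_curve(reads, transcript_class, fractions, min_reads=2, seed=0):
    """Count transcripts detected per class at increasing sequencing depth.

    reads            -- list of transcript IDs, one entry per assigned read
    transcript_class -- dict mapping transcript ID to a label such as "coding"
    fractions        -- fractions of the full library to subsample
    min_reads        -- reads required to call a transcript detected (assumed threshold)
    """
    shuffled = list(reads)
    random.Random(seed).shuffle(shuffled)  # shuffle once, then take nested prefixes
    curve = []
    for frac in sorted(fractions):
        depth = int(len(shuffled) * frac)
        counts = Counter(shuffled[:depth])
        detected = Counter(
            transcript_class[t] for t, c in counts.items() if c >= min_reads
        )
        curve.append((depth, dict(detected)))
    return curve


# Toy library: 50 highly expressed "coding" transcripts (200 reads each)
# and 50 lowly expressed "noncoding" transcripts (4 reads each).
classes = {f"c{i}": "coding" for i in range(50)}
classes.update({f"n{i}": "noncoding" for i in range(50)})
reads = [t for t in classes if t.startswith("c") for _ in range(200)]
reads += [t for t in classes if t.startswith("n") for _ in range(4)]

curve = saturation_curve(reads, classes, [0.1, 0.25, 0.5, 1.0])
for depth, detected in curve:
    print(depth, detected.get("coding", 0), detected.get("noncoding", 0))
```

In this toy setting the highly expressed class is recovered at shallow depth, while the lowly expressed class keeps gaining transcripts as depth increases, which is the qualitative pattern the abstract reports for coding versus noncoding transcripts.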

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7472/9251931/880f67749b62/12864_2022_8717_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7472/9251931/5f907cfc8412/12864_2022_8717_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7472/9251931/66e4a0801771/12864_2022_8717_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7472/9251931/24daf50dd76f/12864_2022_8717_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7472/9251931/d7ca7fe9910b/12864_2022_8717_Fig5_HTML.jpg

Similar articles

1
The effects of sequencing depth on the assembly of coding and noncoding transcripts in the human genome.
BMC Genomics. 2022 Jul 4;23(1):487. doi: 10.1186/s12864-022-08717-z.
2
Merging short and stranded long reads improves transcript assembly.
PLoS Comput Biol. 2023 Oct 26;19(10):e1011576. doi: 10.1371/journal.pcbi.1011576. eCollection 2023 Oct.
3
A Full-Length mRNA Transcriptome Generated From Hybrid-Corrected PacBio Long-Reads Improves the Transcript Annotation and Identifies Thousands of Novel Splice Variants in Atlantic Salmon.
Front Genet. 2021 Apr 27;12:656334. doi: 10.3389/fgene.2021.656334. eCollection 2021.
4
Optimizing de novo assembly of short-read RNA-seq data for phylogenomics.
BMC Genomics. 2013 May 14;14:328. doi: 10.1186/1471-2164-14-328.
5
Accurate transcriptome assembly by Nanopore RNA sequencing reveals novel functional transcripts in hepatocellular carcinoma.
Cancer Sci. 2021 Sep;112(9):3555-3568. doi: 10.1111/cas.15058. Epub 2021 Jul 29.
6
Accurate identification and analysis of human mRNA isoforms using deep long read sequencing.
G3 (Bethesda). 2013 Mar;3(3):387-97. doi: 10.1534/g3.112.004812. Epub 2013 Mar 1.
7
A survey of the complex transcriptome from the highly polyploid sugarcane genome using full-length isoform sequencing and de novo assembly from short read sequencing.
BMC Genomics. 2017 May 22;18(1):395. doi: 10.1186/s12864-017-3757-8.
8
Enhancing transcriptome expression quantification through accurate assignment of long RNA sequencing reads with TranSigner.
bioRxiv. 2024 Aug 17:2024.04.13.589356. doi: 10.1101/2024.04.13.589356.
9
A comprehensive investigation of metagenome assembly by linked-read sequencing.
Microbiome. 2020 Nov 11;8(1):156. doi: 10.1186/s40168-020-00929-3.
10
PacBio single molecule long-read sequencing provides insight into the complexity and diversity of the Pinctada fucata martensii transcriptome.
BMC Genomics. 2020 Jul 13;21(1):481. doi: 10.1186/s12864-020-06894-3.

Cited by

1
Transcriptome sequencing reveals the evolutionary histories and gene expression evolution in two related Pagurus species.
PLoS One. 2025 Aug 20;20(8):e0330170. doi: 10.1371/journal.pone.0330170. eCollection 2025.
2
Transposable element expression and sub-cellular dynamics during hPSC differentiation to endoderm, mesoderm, and ectoderm lineages.
Nat Commun. 2025 Aug 18;16(1):7670. doi: 10.1038/s41467-025-63080-3.
3
Protocol to distinguish pre-mRNA from mRNA in RNA-protein interaction studies.
STAR Protoc. 2025 Jul 22;6(3):103967. doi: 10.1016/j.xpro.2025.103967.
4
Long-read RNA sequencing enables full-length chimeric transcript annotation of transposable elements in lung adenocarcinoma.
BMC Cancer. 2025 Mar 15;25(1):482. doi: 10.1186/s12885-025-13888-5.
5
Sex-limited experimental evolution drives transcriptomic divergence in a hermaphrodite.
Genome Biol Evol. 2024 Jan 5;16(1). doi: 10.1093/gbe/evad235.
6
Merging short and stranded long reads improves transcript assembly.
PLoS Comput Biol. 2023 Oct 26;19(10):e1011576. doi: 10.1371/journal.pcbi.1011576. eCollection 2023 Oct.
7
The status of the human gene catalogue.
ArXiv. 2023 Mar 24:arXiv:2303.13996v1.
8
Flnc: Machine Learning Improves the Identification of Novel Long Noncoding RNAs from Stand-Alone RNA-Seq Data.
Noncoding RNA. 2022 Oct 13;8(5):70. doi: 10.3390/ncrna8050070.