Full-length isoform concatenation sequencing to resolve cancer transcriptome complexity.

Affiliations

The Steve and Cindy Rasmussen Institute for Genomic Medicine, Abigail Wexner Research Institute at Nationwide Children's Hospital, 575 Children's Crossroad, Columbus, OH, 43215, USA.

Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, USA.

Publication information

BMC Genomics. 2024 Jan 29;25(1):122. doi: 10.1186/s12864-024-10021-x.

Abstract

BACKGROUND

Cancers exhibit complex transcriptomes with aberrant splicing that induces isoform-level differential expression compared to non-diseased tissues. Transcriptomic profiling using short-read sequencing offers a cost-effective approach for evaluating isoform expression, although short-read assembly is limited in its ability to accurately infer full-length transcripts. Long-read RNA sequencing (Iso-Seq), using the Pacific Biosciences (PacBio) platform, can overcome such limitations by providing full-length isoform sequence resolution, which requires no read assembly and represents native expressed transcripts. A constraint of the Iso-Seq protocol is its lower read output per instrument run, which can, for example, impair the detection of lowly expressed transcripts. To address this deficiency, we developed a concatenation workflow, PacBio Full-Length Isoform Concatemer Sequencing (PB_FLIC-Seq), designed to increase the number of unique, sequenced PacBio long reads, thereby improving overall detection of unique isoforms. In addition, we anticipate that the increase in read depth will help improve the detection of moderate to low-level expressed isoforms.

RESULTS

In sequencing a commercial reference (Spike-In RNA Variants; SIRV) with known isoform complexity we demonstrated a 3.4-fold increase in read output per run and improved SIRV recall when using the PB_FLIC-Seq method compared to the same samples processed with the Iso-Seq protocol. We applied this protocol to a translational cancer case, also demonstrating the utility of the PB_FLIC-Seq method for identifying differential full-length isoform expression in a pediatric diffuse midline glioma compared to its adjacent non-malignant tissue. Our data analysis revealed increased expression of extracellular matrix (ECM) genes within the tumor sample, including an isoform of the Secreted Protein Acidic and Cysteine Rich (SPARC) gene that was expressed 11,676-fold higher than in the adjacent non-malignant tissue. Finally, by using the PB_FLIC-Seq method, we detected several cancer-specific novel isoforms.
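The two summary metrics quoted above, recall against a reference of known isoforms and fold change between two measurements, can be illustrated with a minimal Python sketch. This is not the authors' analysis pipeline. The function names, isoform identifiers, and counts below are hypothetical toy values chosen for illustration only, and the abstract does not state how recall or fold change were actually computed.

```python
def recall(detected, reference):
    """Fraction of known reference isoforms that appear in the detected set."""
    reference = set(reference)
    if not reference:
        raise ValueError("reference set must not be empty")
    return len(reference & set(detected)) / len(reference)


def fold_change(value, baseline):
    """Simple ratio of a measurement to its baseline (no pseudocount)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return value / baseline


# Hypothetical toy data: identifiers and counts are NOT taken from the study.
reference_isoforms = {"iso_A", "iso_B", "iso_C", "iso_D"}
standard_detected = {"iso_A", "iso_C"}
concatenated_detected = {"iso_A", "iso_B", "iso_C", "unannotated_1"}

standard_recall = recall(standard_detected, reference_isoforms)
concatenated_recall = recall(concatenated_detected, reference_isoforms)
read_output_ratio = fold_change(34, 10)  # toy read counts per run

print(f"standard recall:     {standard_recall:.2f}")
print(f"concatenated recall: {concatenated_recall:.2f}")
print(f"read output ratio:   {read_output_ratio:.1f}-fold")
```

Detected isoforms that are absent from the reference (here `unannotated_1`) do not affect recall; they would lower precision instead, which is why the two are usually reported together.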

CONCLUSION

This work describes a concatenation-based methodology for increasing the number of sequenced full-length isoform reads on the PacBio platform, yielding improved discovery of expressed isoforms. We applied this workflow to profile the transcriptome of a pediatric diffuse midline glioma and adjacent non-malignant tissue. Our findings of cancer-specific novel isoform expression further highlight the importance of long-read sequencing for characterization of complex tumor transcriptomes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/438f/10823626/87576aef63f0/12864_2024_10021_Fig1_HTML.jpg
