Readon: a novel algorithm to identify read-through transcripts with long-read sequencing data.

Affiliations

Key Laboratory of Epigenetic Regulation and Intervention, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China.

College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China.

Publication information

Bioinformatics. 2024 Jun 3;40(6). doi: 10.1093/bioinformatics/btae336.

DOI: 10.1093/bioinformatics/btae336
PMID: 38808568
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11162696/
Abstract

MOTIVATION

The human genome contains many clustered transcriptionally active regions in which the transcription complex cannot immediately terminate transcription at the termination site of the upstream gene, but instead continues to transcribe intergenic regions and downstream genes, producing read-through transcripts. Several studies have demonstrated the regulatory roles of read-through transcripts in tumorigenesis and development. However, the short read length of next-generation sequencing has slowed the discovery of read-through transcripts. To exploit third-generation sequencing data, whose reads are long but error-prone, this study developed a novel minimizer sketch algorithm that identifies read-through transcripts accurately and quickly.

RESULTS

Readon first splits the reference sequence into distinct active regions. Within each region it slides a window along the sequence, calculates minimizers, and constructs specialized structured arrays for query indexing. Candidate read-through transcripts are first screened by their alignment anchors and then pass through further confirmation steps. Comparative assessments against existing software show Readon's superior performance on both simulated and validated real data. Additionally, two downstream tools are provided: one predicts whether a read-through transcript is likely to undergo nonsense-mediated decay or to encode a protein, and the other visualizes splicing patterns.
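
The sliding-window minimizer step described above can be illustrated with a minimal Python sketch. This is not Readon's implementation: the function names (`minimizers`, `build_index`, `region_hits`), the parameters `k` and `w`, the plain dictionary index, and the use of lexicographic order to rank k-mers are all assumptions made for illustration, and practical tools typically rank k-mers by a hash instead. The idea is that a read sharing minimizers with two different regions is a candidate for spanning both.

```python
from collections import defaultdict


def minimizers(seq, k=5, w=4):
    """Return sorted (position, k-mer) minimizers of seq.

    In every window of w consecutive k-mers, the lexicographically
    smallest k-mer is kept (leftmost on ties). Illustrative only.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    picked = set()
    for start in range(len(kmers) - w + 1):
        window = kmers[start:start + w]
        offset = min(range(w), key=lambda j: window[j])
        picked.add((start + offset, window[offset]))
    return sorted(picked)


def build_index(regions, k=5, w=4):
    """Map each minimizer k-mer to the (region name, position) pairs where it occurs."""
    index = defaultdict(list)
    for name, seq in regions.items():
        for pos, kmer in minimizers(seq, k, w):
            index[kmer].append((name, pos))
    return index


def region_hits(read, index, k=5, w=4):
    """Count, per region, how many index entries match the read's minimizers."""
    hits = defaultdict(int)
    for _, kmer in minimizers(read, k, w):
        for name, _ in index.get(kmer, ()):
            hits[name] += 1
    return dict(hits)


if __name__ == "__main__":
    # Hypothetical toy sequences standing in for two neighbouring active regions.
    regions = {
        "upstream": "ACGTTGCATGCCGATAGCTAGG",
        "downstream": "TTGACCGGTAAGCTTCGGATCA",
    }
    index = build_index(regions)
    # A read covering both regions shares minimizers with each of them.
    read = regions["upstream"] + regions["downstream"]
    print(region_hits(read, index))
```

Anchors found this way are only a first screen. The abstract states that candidates then go through further confirmation steps, which this sketch does not attempt to reproduce.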

AVAILABILITY AND IMPLEMENTATION

Readon is freely available on GitHub (https://github.com/Bulabula45/Readon).
