A Genomic Language Model for Chimera Artifact Detection in Nanopore Direct RNA Sequencing.

Authors

Li Yangyang, Wang Ting-You, Guo Qingxiang, Ren Yanan, Lu Xiaotong, Cao Qi, Yang Rendong

Affiliations

Department of Urology, Northwestern University Feinberg School of Medicine, 303 E Superior St, Chicago, 60611, IL, USA.

Robert H. Lurie Comprehensive Cancer Center, Northwestern University Feinberg School of Medicine, 675 N St Clair St, Chicago, 60611, IL, USA.

Publication information

bioRxiv. 2024 Oct 26:2024.10.23.619929. doi: 10.1101/2024.10.23.619929.

DOI: 10.1101/2024.10.23.619929
PMID: 39484530
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11526916/
Abstract

Chimera artifacts in nanopore direct RNA sequencing (dRNA-seq) can significantly distort transcriptome analyses, yet their detection and removal remain challenging due to limitations in existing basecalling models. We present DeepChopper, a genomic language model that precisely identifies and removes adapter sequences from base-called dRNA-seq long reads at single-base resolution, operating independently of raw signal or alignment information to effectively eliminate chimeric read artifacts. By removing these artifacts, DeepChopper substantially improves the accuracy of critical downstream analyses, such as transcript annotation and gene fusion detection, thereby enhancing the reliability and utility of nanopore dRNA-seq for transcriptomics research.
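The chopping step described in the abstract can be illustrated with a minimal sketch. It assumes that some model has already assigned each base of a read a label (1 for adapter, 0 for transcript); the function names, the label encoding, and the length thresholds below are hypothetical and are not DeepChopper's actual interface. The sketch only shows how per-base predictions could be turned into split reads once an internal adapter has been found.

```python
from typing import List, Sequence, Tuple


def find_adapter_spans(labels: Sequence[int], min_len: int = 1) -> List[Tuple[int, int]]:
    """Return half-open (start, end) spans of consecutive adapter labels (1).

    Runs shorter than `min_len` are ignored (hypothetical noise filter).
    """
    spans: List[Tuple[int, int]] = []
    start = None
    for i, label in enumerate(labels):
        if label == 1 and start is None:
            start = i  # a run of adapter bases begins here
        elif label != 1 and start is not None:
            if i - start >= min_len:
                spans.append((start, i))
            start = None
    # Close a run that extends to the end of the read.
    if start is not None and len(labels) - start >= min_len:
        spans.append((start, len(labels)))
    return spans


def chop_read(
    seq: str,
    labels: Sequence[int],
    min_adapter_len: int = 1,
    min_segment_len: int = 1,
) -> List[str]:
    """Remove predicted adapter spans and return the remaining segments.

    An internal adapter splits a chimeric read into two or more segments;
    a terminal adapter is simply trimmed off.
    """
    if len(seq) != len(labels):
        raise ValueError("seq and labels must have the same length")
    segments: List[str] = []
    prev_end = 0
    for start, end in find_adapter_spans(labels, min_adapter_len):
        if start - prev_end >= min_segment_len:
            segments.append(seq[prev_end:start])
        prev_end = end
    if len(seq) - prev_end >= min_segment_len:
        segments.append(seq[prev_end:])
    return segments


# Toy chimeric read: two transcript fragments joined by an internal adapter.
read = "ACGTACGT" + "AAAAAA" + "GGCCTTAA"
per_base_labels = [0] * 8 + [1] * 6 + [0] * 8
print(chop_read(read, per_base_labels))  # ['ACGTACGT', 'GGCCTTAA']
```

Because the sketch works on base-called sequence and labels alone, it mirrors the abstract's point that no raw signal or alignment is required; in practice the thresholds would need tuning to avoid splitting reads on spurious short predictions.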
