Detecting Foldback Artifacts in Long Reads.

Authors

Heinz Jakob M, Meyerson Matthew, Li Heng

Affiliations

Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States.

Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, United States.

Publication information

bioRxiv. 2025 Jul 18:2025.07.15.664946. doi: 10.1101/2025.07.15.664946.

DOI:10.1101/2025.07.15.664946
PMID:40791372
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12338684/

Abstract

Long-read sequencing data is useful for detecting large and complex structural variations; however, technical artifacts can lead to false structural variant calls. In our analyses, we became aware of a foldback artifact in long-read data. Therefore, we developed the open-source Breakinator tool to flag putative foldback artifact reads, as well as previously known chimeric artifacts. Through an alignment-based approach, Breakinator can detect artifacts missed by existing quality control tools. We profiled the occurrences of foldbacks and chimeric reads in both nanopore and single-molecule real-time sequences across a range of specimens, library types, sequencing chemistries, sequencing machines, and base-calling software.
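
The abstract does not spell out how the alignment-based check works, so the following is only a minimal sketch of one plausible way to flag a foldback-like read from its split alignments. It is not Breakinator's actual implementation. The `Segment` structure, the helper names, and the 200 bp `max_ref_gap` threshold are all hypothetical and chosen for illustration. The idea assumed here is that a foldback read leaves the reference on one strand and immediately re-enters at nearly the same position on the opposite strand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One aligned piece of a read (hypothetical structure, for illustration)."""
    query_start: int  # 0-based start on the read, in original read orientation
    query_end: int    # end on the read, exclusive
    chrom: str
    ref_start: int    # leftmost reference coordinate, 0-based
    ref_end: int      # rightmost reference coordinate, exclusive
    strand: str       # "+" or "-"


def exit_pos(seg: Segment) -> int:
    """Reference coordinate at which the read leaves this segment."""
    return seg.ref_end if seg.strand == "+" else seg.ref_start


def entry_pos(seg: Segment) -> int:
    """Reference coordinate at which the read enters this segment."""
    return seg.ref_start if seg.strand == "+" else seg.ref_end


def is_foldback_junction(a: Segment, b: Segment, max_ref_gap: int = 200) -> bool:
    """True if consecutive segments a -> b look like a fold back onto the
    opposite strand near the same locus. max_ref_gap is an assumed threshold."""
    return (
        a.chrom == b.chrom
        and a.strand != b.strand
        and abs(exit_pos(a) - entry_pos(b)) <= max_ref_gap
    )


def has_foldback(segments: list[Segment], max_ref_gap: int = 200) -> bool:
    """Check every pair of segments that are adjacent along the read."""
    ordered = sorted(segments, key=lambda s: s.query_start)
    return any(
        is_foldback_junction(a, b, max_ref_gap)
        for a, b in zip(ordered, ordered[1:])
    )


# Toy read: the first half maps forward, the second half maps back over the
# same interval on the reverse strand.
foldback_read = [
    Segment(0, 1000, "chr1", 10_000, 11_000, "+"),
    Segment(1010, 2000, "chr1", 10_010, 11_000, "-"),
]
# Toy read whose second half maps to a different chromosome.
other_read = [
    Segment(0, 1000, "chr1", 10_000, 11_000, "+"),
    Segment(1010, 2000, "chr2", 50_000, 50_990, "-"),
]
print(has_foldback(foldback_read), has_foldback(other_read))
```

In practice the segments would come from the primary and supplementary alignments of each read in a BAM or PAF file. A real inversion breakpoint can produce the same pattern, so a flag of this kind marks a read as a putative artifact rather than a confirmed one.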

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/af8d/12478012/84c43dfcfe10/nihpp-2025.07.15.664946v2-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/af8d/12478012/803b21f7470a/nihpp-2025.07.15.664946v2-f0001.jpg
