MATCHCLIP: locate precise breakpoints for copy number variation using CIGAR string by matching soft clipped reads.

Affiliation

Department of Biostatistics and Epidemiology, University of Pennsylvania Perelman School of Medicine Philadelphia, PA, USA.

Publication information

Front Genet. 2013 Aug 16;4:157. doi: 10.3389/fgene.2013.00157. eCollection 2013.

DOI:10.3389/fgene.2013.00157
PMID:23967014
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3744852/
Abstract

Copy number variations (CNVs) are associated with many complex diseases. Next-generation sequencing data enable one to identify precise CNV breakpoints, both to better understand the underlying molecular mechanisms and to design more efficient assays. Using the CIGAR strings of the reads, we develop a method that can identify the exact CNV breakpoints; when the breakpoints lie in a repeated region, the method reports a range within which the breakpoints can slide. Our method identifies the breakpoints of a CNV using both the positions and the CIGAR strings of the reads that cover those breakpoints. A read with a long soft-clipped part (denoted S in CIGAR) at its 3' (right) end can be used to identify the breakpoint on the 5' (left) side, and a read with a long S part at its 5' end can be used to identify the breakpoint on the 3' side. To ensure that both types of reads cover the same CNV, we require the overlapped common string to include both of the soft-clipped parts. When a CNV starts and ends in the same repeated region, its breakpoints are not unique; in that case our method reports the leftmost positions for the breakpoints and a range within which the breakpoints can be incremented without changing the variant sequence. We have implemented the methods in a C++ package intended for the current Illumina MiSeq and HiSeq platforms, for both whole-genome and exon sequencing. Our simulation studies have shown that our method compares favorably with other similar methods in terms of true discovery rate, false positive rate and breakpoint accuracy. Our results from a real application have shown that the detected CNVs are consistent with zygosity and read depth information. The software package is available at http://statgene.med.upenn.edu/softprog.html.
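The steps described in the abstract can be illustrated with a small Python sketch. This is not the authors' C++ implementation, only one plausible reading of the abstract for a simple deletion. All function names, the `min_clip` threshold of 10 bases, and the toy sequences are assumptions made for illustration. The sketch shows (1) reading breakpoint evidence from a read's SAM position and CIGAR string, (2) checking that a 3'-clipped read and a 5'-clipped read overlap on a common string containing both soft-clipped parts, and (3) reporting the leftmost breakpoints of a deletion together with the range over which they can slide.

```python
import re

def parse_cigar(cigar):
    """Split a CIGAR string such as '60M40S' into (length, op) pairs."""
    if not re.fullmatch(r"(\d+[MIDNSHP=X])+", cigar):
        raise ValueError(f"malformed CIGAR: {cigar!r}")
    return [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]

def breakpoint_evidence(pos, cigar, min_clip=10):
    """Classify a read by its soft clip. `pos` is the 1-based SAM POS.

    Returns ("left", p) if the read ends in a long S, where p is the last
    aligned reference base; ("right", p) if it starts with a long S, where
    p is the first aligned reference base; otherwise None.
    `min_clip` is an assumed threshold for a "long" clip.
    """
    core = [(n, op) for n, op in parse_cigar(cigar) if op != "H"]
    if not core:
        return None
    # Operations that consume reference bases.
    ref_len = sum(n for n, op in core if op in "MDN=X")
    if core[-1][1] == "S" and core[-1][0] >= min_clip:
        return ("left", pos + ref_len - 1)
    if core[0][1] == "S" and core[0][0] >= min_clip:
        return ("right", pos)
    return None

def match_offset(left_seq, left_clip, right_seq, right_clip):
    """Overlay a 3'-clipped read (left) and a 5'-clipped read (right).

    Returns the offset d of right_seq within left_seq such that the two
    reads agree on their overlap and the overlap contains both soft-clipped
    parts, or None if no such offset exists.
    """
    aligned = len(left_seq) - left_clip
    for d in range(aligned + 1):
        # The overlap must reach the end of left_seq and hold the right clip.
        if d + right_clip > len(left_seq) or d + len(right_seq) < len(left_seq):
            continue
        if left_seq[d:] == right_seq[:len(left_seq) - d]:
            return d
    return None

def normalize_deletion(ref, start, end):
    """Deletion removes ref[start:end] (0-based, half-open, start < end).

    Returns (leftmost_start, leftmost_end, slide): the leftmost equivalent
    placement and the number of bases it can shift right without changing
    the resulting variant sequence.
    """
    while start > 0 and ref[start - 1] == ref[end - 1]:
        start, end = start - 1, end - 1
    slide = 0
    while end + slide < len(ref) and ref[start + slide] == ref[end + slide]:
        slide += 1
    return start, end, slide

if __name__ == "__main__":
    print(breakpoint_evidence(1000, "60M40S"))        # ('left', 1059)
    print(match_offset("AAACCCGGG", 3, "CCCGGGTTT", 3))
    print(normalize_deletion("GGACACACTT", 4, 6))     # (2, 4, 4)
```

In the last call, deleting any "AC" or "CA" unit inside the ACACAC repeat yields the same variant sequence GGACACTT, so the sketch reports the leftmost placement (2, 4) and a slide range of 4 bases, which mirrors the behavior the abstract describes for breakpoints in repeated regions.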

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f530/3744852/276e528493b5/fgene-04-00157-g0001.jpg
