GeDi: applying suffix arrays to increase the repertoire of detectable SNVs in tumour genomes.

Author affiliations

Department of Computing, Imperial College London, London, SW7 2AZ, UK.

Systems Biology PhD Program, Columbia University in New York City, New York, USA.

Publication information

BMC Bioinformatics. 2020 Feb 5;21(1):45. doi: 10.1186/s12859-020-3367-3.

DOI: 10.1186/s12859-020-3367-3
PMID: 32024475
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7003401/
Abstract

BACKGROUND

Current popular variant calling pipelines rely on the mapping coordinates of each input read to a reference genome in order to detect variants. Since reads deriving from variant loci that diverge in sequence substantially from the reference are often assigned incorrect mapping coordinates, variant calling pipelines that rely on mapping coordinates can exhibit reduced sensitivity.

RESULTS

In this work we present GeDi, a suffix array-based somatic single nucleotide variant (SNV) calling algorithm that does not rely on read mapping coordinates to detect SNVs and is therefore capable of reference-free and mapping-free SNV detection. GeDi executes with practical runtime and memory resource requirements, is capable of SNV detection at very low allele frequency (<1%), and detects SNVs with high sensitivity at complex variant loci, dramatically outperforming MuTect, a well-established pipeline.
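
To make the idea of mapping-free detection concrete, the following is a minimal sketch and not GeDi's actual algorithm. It assumes a toy setting in which a suffix array is built over the normal (control) reads, and every k-mer of the tumour reads is looked up by binary search. K-mers that never occur in the normal reads are reported as candidates, with no reference genome or mapping coordinates involved. The function names, the `$` separator, and the naive construction are illustrative assumptions.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text` in lexicographic order."""
    # Naive construction by sorting suffixes; adequate for a toy example only.
    return sorted(range(len(text)), key=lambda i: text[i:])


def contains(text, sa, pattern):
    """Binary-search the suffix array for a suffix that starts with `pattern`."""
    lo, hi = 0, len(sa)
    m = len(pattern)
    while lo < hi:
        mid = (lo + hi) // 2
        # Compare only the first m characters of the suffix with the pattern.
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo < len(sa) and text[sa[lo]:sa[lo] + m] == pattern


def tumour_specific_kmers(tumour_reads, normal_reads, k):
    """Collect k-mers seen in tumour reads but absent from all normal reads."""
    # Join reads with a non-nucleotide separator so no k-mer spans two reads.
    normal_text = "$".join(normal_reads)
    sa = build_suffix_array(normal_text)
    candidates = set()
    for read in tumour_reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if not contains(normal_text, sa, kmer):
                candidates.add(kmer)
    return candidates


if __name__ == "__main__":
    normal = ["ACGTACGT"]
    tumour = ["ACGTTCGT"]  # differs from the normal read by one base (A -> T)
    print(sorted(tumour_specific_kmers(tumour, normal, k=4)))
```

In this toy input every reported k-mer overlaps the single substituted base, which is the signal a mapping-free caller can exploit. A practical tool would also need to handle sequencing errors, reverse complements, and linear-time suffix array construction, none of which are shown here.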

CONCLUSION

By designing novel suffix array-based SNV calling methods, we have developed a practical SNV calling tool, GeDi, that can characterise SNVs at complex variant loci and at low allele frequency, thus increasing the repertoire of detectable SNVs in tumour genomes. We expect GeDi to find use cases in targeted deep sequencing analysis, and to serve as a replacement for, and an improvement over, previous suffix array-based SNV calling methods.
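
The link between low allele frequency and targeted deep sequencing can be illustrated with a short calculation. This is a simplified sketch and not part of GeDi: it assumes that each read covering a locus carries the variant independently with probability equal to the allele frequency, ignores sequencing errors, and uses a hypothetical threshold of three supporting reads.

```python
from math import comb


def prob_at_least(depth, vaf, min_reads):
    """P(X >= min_reads) for X ~ Binomial(depth, vaf)."""
    # Sum the probabilities of observing fewer than min_reads variant reads,
    # then take the complement.
    p_fewer = sum(
        comb(depth, i) * vaf**i * (1 - vaf) ** (depth - i)
        for i in range(min_reads)
    )
    return 1.0 - p_fewer


if __name__ == "__main__":
    # Compare a typical whole-genome depth with a targeted deep sequencing depth.
    for depth in (30, 1000):
        p = prob_at_least(depth, vaf=0.01, min_reads=3)
        print(f"depth={depth}: P(at least 3 variant reads) = {p:.4f}")
```

Under these assumptions, a variant at 1% allele frequency is very unlikely to reach three supporting reads at 30x coverage, while at 1000x it almost always does, which is why sensitivity at very low allele frequency matters most for deep targeted data.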
