Vecuum: identification and filtration of false somatic variants caused by recombinant vector contamination.

Affiliations

Severance Biomedical Science Institute, Brain Korea 21 PLUS Project for Medical Sciences, Yonsei University College of Medicine, Seoul 03722, South Korea.

Graduate School of Medical Science and Engineering, KAIST, Daejeon 34141, South Korea.

Publication information

Bioinformatics. 2016 Oct 15;32(20):3072-3080. doi: 10.1093/bioinformatics/btw383. Epub 2016 Jun 22.

DOI: 10.1093/bioinformatics/btw383
PMID: 27334474
Abstract

MOTIVATION

Advances in sequencing technologies have remarkably lowered the detection limit for somatic variants, making low-frequency variants detectable. However, calling mutations in this range is still confounded by many factors, including environmental contamination. Vector contamination is a recurring issue and is especially problematic because vector inserts are hardly distinguishable from the sample sequences. Such inserts, which may harbor polymorphisms and engineered functional mutations, can cause false variants to be called at the corresponding sites. Numerous vector-screening methods have been developed, but none can handle contamination from inserts because they focus on vector backbone sequences alone.

RESULTS

We developed a novel method, Vecuum, that identifies vector-originated reads and the resulting false variants. Since vector inserts are generally constructed from intron-less cDNAs, Vecuum identifies vector-originated reads by inspecting the clipping patterns at exon junctions. False variant calls are then detected based on the biased distribution of mutant alleles toward vector-originated reads. Tests on simulated and spike-in experimental data validated that Vecuum could detect 93% of vector contaminants and could remove up to 87% of variant-like false calls with 100% precision. Application to public sequence datasets demonstrated the utility of Vecuum in detecting false variants resulting from various types of external contamination.
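The two ideas in the paragraph above can be illustrated with a minimal Python sketch. It is not the Vecuum implementation (which is Java-based), and the function names, the `tol` parameter, and the exon coordinates are hypothetical. It assumes reads are given as a SAM `POS` and `CIGAR` string, and it uses a one-sided Fisher's exact test as one plausible way to quantify the "biased distribution" of mutant alleles; the abstract does not say which statistic Vecuum uses.

```python
import re
from math import comb

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def clip_positions(pos, cigar):
    """Return (left, right) 1-based reference positions where a soft clip
    meets aligned sequence; None on a side without a soft clip."""
    ops = [(int(n), op) for n, op in CIGAR_RE.findall(cigar)]
    ops = [o for o in ops if o[1] != "H"]  # hard clips carry no read bases
    if not ops:
        return None, None
    # Operations that consume reference bases: M, D, N, =, X.
    ref_len = sum(n for n, op in ops if op in "MDN=X")
    left = pos if ops[0][1] == "S" else None
    right = pos + ref_len - 1 if ops[-1][1] == "S" else None
    return left, right

def is_vector_like(pos, cigar, exon_starts, exon_ends, tol=0):
    """Assumption: a read is 'vector-like' if a soft clip begins or ends
    within `tol` bases of an annotated exon boundary, as expected when
    intron-less cDNA is aligned to a genome."""
    left, right = clip_positions(pos, cigar)
    hit_left = left is not None and any(abs(left - s) <= tol for s in exon_starts)
    hit_right = right is not None and any(abs(right - e) <= tol for e in exon_ends)
    return hit_left or hit_right

def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher's exact p-value for the 2x2 table
        mutant allele:    a vector-like reads, b other reads
        reference allele: c vector-like reads, d other reads
    i.e. the probability of seeing at least `a` mutant vector-like reads
    if alleles were distributed independently of read origin."""
    mutant, vector, n = a + b, a + c, a + b + c + d
    upper = min(mutant, vector)
    tail = sum(comb(mutant, k) * comb(n - mutant, vector - k)
               for k in range(a, upper + 1))
    return tail / comb(n, vector)

if __name__ == "__main__":
    exon_starts, exon_ends = {200}, {149}  # hypothetical exon boundaries
    print(is_vector_like(100, "50M26S", exon_starts, exon_ends))
    print(is_vector_like(100, "76M", exon_starts, exon_ends))
    print(fisher_enrichment_p(5, 0, 0, 5))
```

In this sketch a small p-value at a candidate site would suggest that the mutant allele is carried mostly by vector-like reads, which is the pattern the abstract describes for false calls; any threshold applied to it would be a further assumption.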

AVAILABILITY AND IMPLEMENTATION

A Java-based implementation of the method is available at http://vecuum.sourceforge.net/

CONTACT

swkim@yuhs.ac

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

1. Vecuum: identification and filtration of false somatic variants caused by recombinant vector contamination.
Bioinformatics. 2016 Oct 15;32(20):3072-3080. doi: 10.1093/bioinformatics/btw383. Epub 2016 Jun 22.
2. AIRVF: a filtering toolbox for precise variant calling in Ion Torrent sequencing.
Bioinformatics. 2018 Apr 1;34(7):1232-1234. doi: 10.1093/bioinformatics/btx719.
3. SoloDel: a probabilistic model for detecting low-frequent somatic deletions from unmatched sequencing data.
Bioinformatics. 2015 Oct 1;31(19):3105-13. doi: 10.1093/bioinformatics/btv358. Epub 2015 Jun 11.
4. SomVarIUS: somatic variant identification from unpaired tissue samples.
Bioinformatics. 2016 Mar 15;32(6):808-13. doi: 10.1093/bioinformatics/btv685. Epub 2015 Nov 20.
5. SiNVICT: ultra-sensitive detection of single nucleotide variants and indels in circulating tumour DNA.
Bioinformatics. 2017 Jan 1;33(1):26-34. doi: 10.1093/bioinformatics/btw536. Epub 2016 Aug 16.
6. Mutascope: sensitive detection of somatic mutations from deep amplicon sequencing.
Bioinformatics. 2013 Aug 1;29(15):1908-9. doi: 10.1093/bioinformatics/btt305. Epub 2013 May 27.
7. VarSim: a high-fidelity simulation and validation framework for high-throughput genome sequencing with cancer applications.
Bioinformatics. 2015 May 1;31(9):1469-71. doi: 10.1093/bioinformatics/btu828. Epub 2014 Dec 17.
8. ReliableGenome: annotation of genomic regions with high/low variant calling concordance.
Bioinformatics. 2017 Jan 15;33(2):155-160. doi: 10.1093/bioinformatics/btw587. Epub 2016 Sep 7.
9. SAMSVM: A tool for misalignment filtration of SAM-format sequences with support vector machine.
J Bioinform Comput Biol. 2015 Dec;13(6):1550025. doi: 10.1142/S0219720015500250. Epub 2015 Aug 24.
10. QuASAR: quantitative allele-specific analysis of reads.
Bioinformatics. 2015 Apr 15;31(8):1235-42. doi: 10.1093/bioinformatics/btu802. Epub 2014 Dec 4.

Cited by

1. Analysis of low-level somatic mosaicism reveals stage and tissue-specific mutational features in human development.
PLoS Genet. 2022 Sep 19;18(9):e1010404. doi: 10.1371/journal.pgen.1010404. eCollection 2022 Sep.
2. cDNA-detector: detection and removal of cDNA contamination in DNA sequencing libraries.
BMC Bioinformatics. 2021 Dec 24;22(1):611. doi: 10.1186/s12859-021-04529-2.
3. VecScreen_plus_taxonomy: imposing a tax(onomy) increase on vector contamination screening.
Bioinformatics. 2018 Mar 1;34(5):755-759. doi: 10.1093/bioinformatics/btx669.