Conpair: concordance and contamination estimator for matched tumor-normal pairs.

Authors

Bergmann Ewa A, Chen Bo-Juen, Arora Kanika, Vacic Vladimir, Zody Michael C

Affiliation

New York Genome Center, New York, NY 10013, USA.

Publication

Bioinformatics. 2016 Oct 15;32(20):3196-3198. doi: 10.1093/bioinformatics/btw389. Epub 2016 Jun 26.

DOI: 10.1093/bioinformatics/btw389
PMID: 27354699
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5048070/

Abstract

MOTIVATION

Sequencing of matched tumor and normal samples is the standard study design for reliable detection of somatic alterations. However, even very low levels of cross-sample contamination significantly impact calling of somatic mutations, because contaminant germline variants can be incorrectly interpreted as somatic. There are currently no sequence-only based methods that reliably estimate contamination levels in tumor samples, which frequently display copy number changes. As a solution, we developed Conpair, a tool for detection of sample swaps and cross-individual contamination in whole-genome and whole-exome tumor-normal sequencing experiments.
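
To make the motivation concrete, the following is a minimal back-of-the-envelope sketch, not Conpair's method. It assumes a diploid site where the sequenced sample is homozygous reference, a contaminant carrying `alt_copies` copies of the alternate allele, and independent, error-free reads. The function names are hypothetical.

```python
from math import comb


def expected_contaminant_vaf(contamination: float, alt_copies: int) -> float:
    """Expected alt-allele fraction contributed by a contaminant.

    Assumes a diploid site where the sequenced sample itself is homozygous
    reference and the contaminant carries `alt_copies` (0, 1 or 2) alt alleles.
    """
    if not 0.0 <= contamination <= 1.0:
        raise ValueError("contamination must be in [0, 1]")
    if alt_copies not in (0, 1, 2):
        raise ValueError("alt_copies must be 0, 1 or 2")
    return contamination * alt_copies / 2


def prob_at_least_k_alt_reads(depth: int, vaf: float, k: int) -> float:
    """Binomial probability of seeing at least k alt reads among `depth` reads."""
    below_k = sum(
        comb(depth, i) * vaf**i * (1 - vaf) ** (depth - i) for i in range(k)
    )
    return 1.0 - below_k


if __name__ == "__main__":
    # Heterozygous contaminant variant at 2% contamination (illustrative values).
    vaf = expected_contaminant_vaf(contamination=0.02, alt_copies=1)
    for depth in (30, 100, 300):
        p = prob_at_least_k_alt_reads(depth, vaf, k=3)
        print(f"depth={depth:4d}  expected VAF={vaf:.3f}  P(>=3 alt reads)={p:.3f}")
```

Under these assumptions, the chance that a contaminant germline variant gathers several supporting reads grows with sequencing depth, which is one way such variants can end up reported as somatic.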

RESULTS

On a ladder of in silico contaminated samples, we demonstrated that Conpair reliably measures contamination levels as low as 0.1%, even in the presence of copy number changes. We also estimated contamination levels in glioblastoma WGS and WXS tumor-normal datasets from TCGA and showed that they strongly correlate with tumor-normal concordance, as well as with the number of germline variants called as somatic by several widely used somatic callers.
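
The tumor-normal concordance mentioned above can be illustrated with a simplified sketch. It treats concordance as the fraction of marker sites genotyped in both samples whose calls agree. This definition, the site keys and the function name are assumptions for illustration; Conpair's actual marker selection and filtering may differ.

```python
from typing import Dict


def genotype_concordance(normal: Dict[str, str], tumor: Dict[str, str]) -> float:
    """Fraction of shared marker sites with identical genotype calls.

    `normal` and `tumor` map a site key (for example "chr1:100") to a
    genotype string (for example "AG"). Only sites present in both are used.
    """
    shared = [site for site in normal if site in tumor]
    if not shared:
        raise ValueError("no marker sites are genotyped in both samples")
    matches = sum(normal[site] == tumor[site] for site in shared)
    return matches / len(shared)


if __name__ == "__main__":
    # Toy genotypes, invented for illustration only.
    normal_calls = {"chr1:100": "AA", "chr1:200": "AG", "chr2:50": "GG", "chr3:10": "CC"}
    tumor_calls = {"chr1:100": "AA", "chr1:200": "AG", "chr2:50": "AG"}
    print(f"concordance = {genotype_concordance(normal_calls, tumor_calls):.3f}")
```

A low value from a measure like this would suggest a sample swap, while a mildly reduced value could reflect contamination or tumor-specific changes.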

AVAILABILITY AND IMPLEMENTATION

The method is available at: https://github.com/nygenome/conpair

CONTACT: egrabowska@gmail.com or mczody@nygenome.org

Supplementary information: Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/349b/5048070/b222aae15e37/btw389f1p.jpg
