Biomedical Sciences Graduate Program, University of California, San Diego, La Jolla, CA, USA.
Human Biology Program, J. Craig Venter Institute, La Jolla, CA, USA.
BMC Bioinformatics. 2019 May 27;20(1):265. doi: 10.1186/s12859-019-2753-1.
In standard high-throughput sequencing analysis, genetic variants are not assigned to a homologous chromosome of origin. Making that assignment, a process called haplotype phasing, can reveal information important for understanding the relationship between genetic variants and biological phenotypes. For example, in genes that carry multiple heterozygous missense variants, phasing resolves whether one or both gene copies are altered. Here, we present a novel approach to phasing variants that takes advantage of unique properties of paired tumor:normal sequencing data from cancer studies.
VAF phasing uses changes in variant allele frequency (VAF) between tumor and normal samples in regions of somatic chromosomal gain or loss to phase germline variants. We apply VAF phasing to 6180 samples from The Cancer Genome Atlas (TCGA) and demonstrate that our method is highly concordant with other standard phasing methods and can phase an average of 33% more variants than other read-backed phasing methods. Using variant annotation tools designed to score gene haplotypes, we find a suggestive association between carrying multiple missense variants in a single copy of a cancer predisposition gene and earlier age of cancer diagnosis.
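The idea behind VAF phasing can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `HetVariant` class, the `phase_by_vaf_shift` function, and the `min_shift` threshold of 0.1 are hypothetical names and values chosen for illustration. The sketch assumes all variants lie in one segment affected by a single somatic gain or loss, so that alternate alleles whose VAF rises in the tumor sit on one homolog and those whose VAF falls sit on the other.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class HetVariant:
    """A heterozygous germline variant (hypothetical record for illustration)."""
    pos: int            # genomic position within the segment
    normal_vaf: float   # alternate-allele fraction in the normal sample
    tumor_vaf: float    # alternate-allele fraction in the tumor sample


def phase_by_vaf_shift(
    variants: List[HetVariant], min_shift: float = 0.1
) -> Dict[int, Optional[str]]:
    """Label each variant by the direction of its tumor-vs-normal VAF shift.

    "A": alternate allele is on the homolog that became relatively more
         abundant in the tumor.
    "B": alternate allele is on the homolog that became relatively less
         abundant in the tumor.
    None: shift is smaller than min_shift (an assumed cutoff), so the
          variant is left unphased.

    Labels are only comparable within a single gain/loss segment.
    """
    phased: Dict[int, Optional[str]] = {}
    for v in variants:
        shift = v.tumor_vaf - v.normal_vaf
        if shift >= min_shift:
            phased[v.pos] = "A"
        elif shift <= -min_shift:
            phased[v.pos] = "B"
        else:
            phased[v.pos] = None
    return phased


# Toy segment with made-up VAF values, as might follow a single-copy gain.
example = [
    HetVariant(pos=100, normal_vaf=0.50, tumor_vaf=0.67),
    HetVariant(pos=250, normal_vaf=0.50, tumor_vaf=0.33),
    HetVariant(pos=400, normal_vaf=0.48, tumor_vaf=0.50),
]
result = phase_by_vaf_shift(example)
```

In this toy segment the variants at positions 100 and 250 receive opposite labels, meaning their alternate alleles are placed on different homologs, while the variant at 400 stays unphased. A real pipeline would also need copy-number segmentation, tumor purity, and read-depth noise, none of which this sketch models.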
VAF phasing exploits unique properties of tumor genomes to phase more germline variants in paired tumor:normal samples than standard read-backed methods can. Our phase-informed association testing results call attention to the need to develop more tools for assessing the joint effect of multiple genetic variants.