Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02142, USA; Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02142, USA; Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA.
Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Cambridge University, Cambridge CB2 1TN, UK.
Am J Hum Genet. 2020 Feb 6;106(2):170-187. doi: 10.1016/j.ajhg.2019.12.011. Epub 2020 Jan 30.
Although quantitative trait locus (QTL) associations have been identified for many molecular traits such as gene expression, it remains challenging to distinguish the causal nucleotide from nearby variants. In addition to traditional QTLs by association, allele-specific (AS) QTLs provide a powerful measure of cis-regulation; they are concordant with traditional QTLs but typically less susceptible to technical/environmental noise. However, existing methods for estimating causal variant probabilities (i.e., fine mapping) cannot produce valid estimates from asQTL signals due to complexities in linkage disequilibrium (LD). We introduce PLASMA (Population Allele-Specific Mapping), a fine-mapping method that integrates QTL and asQTL information to improve accuracy. In simulations, PLASMA accurately prioritizes causal variants over a wide range of genetic architectures. Applied to RNA-seq data from 524 kidney tumor samples, PLASMA achieves greater power at 50 samples than conventional QTL-based fine mapping at 500 samples, with more than 17% of loci fine mapped to within five causal variants, compared to 2% by QTL-based fine mapping. Applied to H3K27AC ChIP-seq from just 28 prostate tumor/normal samples, PLASMA achieves a 6.9-fold overall reduction in median credible set size compared to QTL-based fine mapping. Variants in the PLASMA credible sets for RNA-seq and ChIP-seq were enriched for open chromatin and chromatin looping, respectively, to a comparable or greater degree than credible variants from existing methods while containing far fewer markers. Our results demonstrate how integrating AS activity can substantially improve the detection of causal variants from existing molecular data.