Department of Biomedical Data Science, Stanford University, Stanford, California, United States of America.
Department of Genetics, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America.
PLoS Genet. 2021 Apr 26;17(4):e1009464. doi: 10.1371/journal.pgen.1009464. eCollection 2021 Apr.
As a relatively new methodology, the transcriptome-wide association study (TWAS) has gained interest because of its capacity for gene-level association testing. However, the development of TWAS has outpaced statistical evaluation of TWAS gene prioritization performance. Current TWAS methods vary in their underlying biological assumptions about the tissue specificity of transcriptional regulatory mechanisms. In a previous study from our group, these assumptions may have affected whether TWAS methods better identified associations in single tissues versus multiple tissues. We therefore designed simulation analyses to examine how the interplay between particular TWAS methods and the tissue specificity of gene expression affects power and type I error rates for gene prioritization. We found that cross-tissue identification of expression quantitative trait loci (eQTLs) improved TWAS power. Single-tissue TWAS (i.e., PrediXcan) had robust power to identify genes expressed in single tissues, but often found significant associations in the wrong tissues as well (and therefore had high false-positive rates). Cross-tissue TWAS (i.e., UTMOST) had overall equal or greater power and controlled type I error rates for genes expressed in multiple tissues. Based on these simulation results, we applied a tissue specificity-aware TWAS (TSA-TWAS) analytic framework to look for gene-based associations with pre-treatment laboratory values from AIDS Clinical Trials Group (ACTG) studies. We replicated several proof-of-concept transcriptionally regulated gene-trait associations, including UGT1A1 (encoding the bilirubin uridine diphosphate glucuronosyltransferase enzyme) with total bilirubin levels (p = 3.59×10⁻¹²), and CETP (cholesteryl ester transfer protein) with high-density lipoprotein cholesterol (p = 4.49×10⁻¹²). We also identified several novel genes associated with metabolic and virologic traits, as well as pleiotropic genes that linked plasma viral load, absolute basophil count, and/or triglyceride levels. By highlighting the advantages of different TWAS methods, our simulation study supports a tissue specificity-aware TWAS analytic framework, which revealed novel aspects of HIV-related traits.
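To make the two evaluation metrics concrete, the following is a minimal sketch of how empirical power and type I error rate could be tallied from simulated gene-tissue tests. It is not the authors' simulation code: the function name `empirical_rates`, the significance threshold, and the toy p-values are all hypothetical and chosen only for illustration. Power is taken here as the fraction of truly causal gene-tissue pairs declared significant, and type I error as the fraction of non-causal pairs declared significant (for example, a gene flagged in a tissue where it has no simulated effect).

```python
from typing import Sequence, Tuple


def empirical_rates(p_values: Sequence[float],
                    is_causal: Sequence[bool],
                    alpha: float = 0.05) -> Tuple[float, float]:
    """Return (power, type_i_error) for one batch of simulated tests.

    p_values[i] is the association p-value of gene-tissue pair i;
    is_causal[i] says whether that pair was simulated with a true effect.
    Both the inputs and the default alpha are illustrative assumptions.
    """
    if len(p_values) != len(is_causal):
        raise ValueError("p_values and is_causal must have the same length")

    n_causal = sum(1 for c in is_causal if c)
    n_null = len(is_causal) - n_causal

    # Significant calls among causal pairs (true positives)
    hits_causal = sum(1 for p, c in zip(p_values, is_causal) if c and p < alpha)
    # Significant calls among non-causal pairs (false positives)
    hits_null = sum(1 for p, c in zip(p_values, is_causal) if not c and p < alpha)

    power = hits_causal / n_causal if n_causal else float("nan")
    type_i_error = hits_null / n_null if n_null else float("nan")
    return power, type_i_error


# Toy, made-up p-values: three causal pairs followed by three non-causal pairs.
toy_p = [0.001, 0.20, 0.03, 0.04, 0.60, 0.90]
toy_causal = [True, True, True, False, False, False]
power, type_i_error = empirical_rates(toy_p, toy_causal, alpha=0.05)
```

In this toy input, two of the three causal pairs and one of the three non-causal pairs fall below the threshold. A real evaluation would repeat this over many simulated replicates and would typically use a multiple-testing-corrected threshold rather than 0.05.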