Distributed eQTL analysis with auxiliary information.
Author information
Fang Zhiwen, Li Gen, Li Wendong, Pu Xiaolong, Xiang Dongdong
Affiliations
KLATASDS-MOE, School of Statistics, East China Normal University, Shanghai, China.
Department of Biostatistics, University of Michigan, Ann Arbor, USA.
Publication information
J Stat Plan Inference. 2024 Jan;228:34-45. doi: 10.1016/j.jspi.2023.06.003. Epub 2023 Jun 28.
Expression quantitative trait locus (eQTL) analysis is a useful tool to identify genetic loci that are associated with gene expression levels. Large collaborative efforts such as the Genotype-Tissue Expression (GTEx) project provide valuable resources for eQTL analysis in different tissues. Most existing methods, however, either focus on one tissue at a time, or analyze multiple tissues to identify eQTLs jointly present in multiple tissues. There is a lack of powerful methods to identify eQTLs in a target tissue while effectively borrowing strength from auxiliary tissues. In this paper, we propose a novel statistical framework to improve the eQTL detection efficacy in the tissue of interest with auxiliary information from other tissues. This framework can enhance the power of the hypothesis test for eQTL effects by incorporating shared and specific effects from multiple tissues into the test statistics. We also devise data-driven and distributed computing approaches for efficient implementation of eQTL detection when the number of tissues is large. Numerical studies in simulation demonstrate the efficacy of the proposed method, and the real data analysis of the GTEx example provides novel insights into eQTL findings in different tissues.
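For readers unfamiliar with the basic test that eQTL methods build on, the following is a minimal sketch of a single-tissue association test for one SNP-gene pair: expression is regressed on genotype dosage and the slope is tested against zero. It is not the framework proposed in the paper, which additionally borrows strength from auxiliary tissues. The function name `eqtl_test`, the simulated data, and the normal approximation used for the p-value are all assumptions made for illustration.

```python
import math
import numpy as np


def eqtl_test(genotype, expression):
    """Test one SNP-gene pair in one tissue via simple linear regression.

    Hypothetical helper for illustration only. Returns the estimated slope,
    its t-statistic, and a two-sided p-value based on a normal approximation
    (reasonable for large samples; an exact test would use the t distribution).
    """
    g = np.asarray(genotype, dtype=float)
    y = np.asarray(expression, dtype=float)
    n = g.size

    # Centering both variables removes the intercept from the regression.
    gc = g - g.mean()
    yc = y - y.mean()

    sxx = float(gc @ gc)
    beta = float(gc @ yc) / sxx           # least-squares slope (eQTL effect)

    resid = yc - beta * gc
    sigma2 = float(resid @ resid) / (n - 2)  # residual variance estimate
    se = math.sqrt(sigma2 / sxx)             # standard error of the slope

    t = beta / se
    p = math.erfc(abs(t) / math.sqrt(2.0))   # two-sided normal-approximation p-value
    return beta, t, p


# Simulated example (assumed values): 500 samples, minor allele frequency 0.3,
# true effect 0.5 on expression, unit-variance noise.
rng = np.random.default_rng(0)
n = 500
g = rng.binomial(2, 0.3, size=n)           # genotype dosage in {0, 1, 2}
y = 0.5 * g + rng.normal(size=n)           # expression level in the target tissue

beta_hat, t_stat, p_value = eqtl_test(g, y)
print(f"beta_hat={beta_hat:.3f}, t={t_stat:.2f}, p={p_value:.2e}")
```

In a real analysis this test would be repeated over many SNP-gene pairs with covariate adjustment and multiple-testing correction. The abstract's point is that running it on the target tissue alone discards information available in the other tissues.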