Zhang Wei, Yang Liu, Tang Larry L, Liu Aiyi, Mills James L, Sun Yuanchang, Li Qizhai
Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China.
Department of Biostatistics, School of Public Health, Yale University, New Haven, CT, USA.
BMC Genomics. 2017 Jul 21;18(1):552. doi: 10.1186/s12864-017-3928-7.
Association studies of complex human traits are well suited to identifying deleterious genetic markers. Compared with single-trait analyses, multiple-trait analyses can make better use of the information in both traits and markers, and can thus substantially improve the statistical power of association tests. Principal component analysis (PCA), a well-known tool in multivariate analysis, can be applied to this task. Typically, PCA is first performed on all traits, and the top principal components (PCs) that explain most of the trait variation are then selected to construct the test statistics. In some situations, however, using only these top PCs loses important evidence carried by the discarded PCs and thereby reduces the power of the test.
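The selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `select_top_pcs` and the 0.8 default threshold are assumptions made for the example, and the abstract does not state which cutoff rule is used in practice.

```python
def select_top_pcs(eigenvalues, threshold=0.8):
    """Return indices of the top PCs, in descending eigenvalue order,
    whose cumulative share of total variance first reaches `threshold`.

    Hypothetical helper: the name and the default threshold are
    illustrative assumptions, not taken from the paper.
    """
    if not eigenvalues:
        raise ValueError("eigenvalues must be non-empty")
    total = sum(eigenvalues)
    # Rank PCs from the largest eigenvalue to the smallest.
    order = sorted(range(len(eigenvalues)),
                   key=lambda i: eigenvalues[i], reverse=True)
    selected = []
    cumulative = 0.0
    for i in order:
        selected.append(i)
        cumulative += eigenvalues[i]
        if cumulative / total >= threshold:
            break
    return selected


# Four PCs; the first two explain 70% of the variance.
top = select_top_pcs([4.0, 3.0, 2.0, 1.0], threshold=0.6)
print(top)
```

Every PC outside the returned list is dropped from the test statistic, which is exactly where association signal carried by low-variance PCs can be lost.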
To overcome this drawback while keeping the advantages of the top PCs, we propose a group accumulated test evidence (GATE) procedure. GATE sorts the PCs in descending order of their eigenvalues, divides them into a few groups, and integrates the trait information at the group level.
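The grouping idea can be illustrated with a short sketch. The abstract does not specify how group boundaries are chosen or how evidence is accumulated within a group, so the choices below (contiguous groups of near-equal size, and a plain sum of per-PC test statistics) are assumptions for illustration only and should not be read as the GATE procedure itself. The function name `group_pc_statistics` is likewise hypothetical.

```python
def group_pc_statistics(eigenvalues, statistics, n_groups):
    """Sort PCs by descending eigenvalue, split them into `n_groups`
    contiguous groups, and sum the per-PC statistics within each group.

    Assumptions (not from the paper): near-equal group sizes and a
    simple sum as the group-level summary.
    """
    if len(eigenvalues) != len(statistics):
        raise ValueError("need one statistic per eigenvalue")
    if not 1 <= n_groups <= len(eigenvalues):
        raise ValueError("n_groups must be between 1 and the number of PCs")
    order = sorted(range(len(eigenvalues)),
                   key=lambda i: eigenvalues[i], reverse=True)
    # Spread any remainder over the leading groups.
    base, extra = divmod(len(order), n_groups)
    groups = []
    start = 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)
        members = order[start:start + size]
        groups.append((members, sum(statistics[i] for i in members)))
        start += size
    return groups


# Four PCs with made-up per-PC statistics, split into two groups.
for members, total in group_pc_statistics(
        [1.0, 4.0, 2.0, 3.0], [0.5, 2.0, 1.0, 3.0], n_groups=2):
    print(members, total)
```

Unlike top-PC selection, no PC is discarded here: the low-eigenvalue PCs still contribute through their own group.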
Simulation studies show that the proposed approach has higher statistical power than several existing methods; in some settings the gain in power reaches 25%. The methods are further illustrated using the Heterogeneous Stock Mice data, collected from a quantitative genome-wide association study.
Overall, GATE provides a powerful test for pleiotropic genetic associations.