Gordon Derek, Londono Douglas, Patel Payal, Kim Wonkuk, Finch Stephen J, Heiman Gary A
Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, NJ, USA.
Hum Hered. 2016;81(4):194-209. doi: 10.1159/000457135. Epub 2017 Mar 18.
Our motivation is to calculate the power of 3 statistical tests used when genetic traits operate under a pleiotropic mode of inheritance and qualitative phenotypes are defined by applying thresholds to multiple quantitative phenotypes. Specifically, we formulate a multivariate function that gives the probability that an individual has a specific vector of quantitative trait values conditional on the risk locus genotype. We apply thresholds to define qualitative phenotypes (affected, unaffected) and compute penetrances and conditional genotype frequencies from the multivariate function. We extend the analytic power and minimum-sample-size-necessary (MSSN) formulas for 2 categorical data-based tests of genetic association (the genotype test and the linear trend test [LTT]) to the pleiotropic model. We further compare the MSSN of the genotype test and the LTT with that of a multivariate ANOVA (Pillai). We approximate the MSSN for each statistic with linear models, using a factorial design and ANOVA. With the ANOVA decomposition, we determine which factors most significantly change the power/MSSN for all statistics. Finally, we determine which test statistics have the smallest MSSN. In this work, MSSN calculations are for 2 traits (bivariate distributions) only, for illustrative purposes; we note that the calculations may be extended to any number of traits. Our key findings are as follows. The genotype test usually has lower MSSN requirements than the LTT. More inclusive thresholds (top/bottom 25% vs. top/bottom 10%) have higher sample size requirements. The Pillai test has a much larger MSSN than both the genotype test and the LTT, as a result of sample selection. With these formulas, researchers can specify how many subjects they must collect to localize genes for pleiotropic phenotypes.
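The step from a genotype-conditional trait distribution to penetrances and conditional genotype frequencies can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the paper: the two traits are modeled as bivariate normal with unit variances and a common correlation `rho`, "affected" means both traits exceed their thresholds, genotype frequencies follow Hardy-Weinberg proportions, and the names `upper_orthant`, `penetrances`, and `conditional_genotype_freqs` as well as all numeric inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal: supplies pdf() and cdf()


def upper_orthant(a, b, rho, steps=4000):
    """P(X > a, Y > b) for a standard bivariate normal with |rho| < 1.

    Uses Y | X = x ~ N(rho * x, 1 - rho^2) and a midpoint rule over x.
    """
    s = sqrt(1.0 - rho * rho)
    hi = max(a, 0.0) + 10.0  # the normal density is negligible beyond this
    h = (hi - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += STD.pdf(x) * (1.0 - STD.cdf((b - rho * x) / s))
    return total * h


def penetrances(means, thresholds, rho):
    """P(affected | genotype): both traits exceed their thresholds (assumed rule)."""
    t1, t2 = thresholds
    return [upper_orthant(t1 - m1, t2 - m2, rho) for m1, m2 in means]


def conditional_genotype_freqs(pens, p):
    """P(genotype | affected) by Bayes' rule, assuming Hardy-Weinberg priors."""
    prior = [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]  # 0, 1, 2 risk alleles
    joint = [f * w for f, w in zip(pens, prior)]
    prevalence = sum(joint)  # P(affected) in the population
    return [j / prevalence for j in joint], prevalence


# Hypothetical inputs: trait means per genotype (0, 1, 2 risk alleles),
# thresholds in standard-deviation units, and risk allele frequency p.
means = [(0.0, 0.0), (0.25, 0.15), (0.5, 0.3)]
pens = penetrances(means, thresholds=(1.0, 1.0), rho=0.5)
case_freqs, prevalence = conditional_genotype_freqs(pens, p=0.2)
```

Under these assumptions, the genotype frequencies among unaffected individuals would follow from the same Bayes step applied to whichever rule defines "unaffected". The resulting case and control frequencies are the kind of input that power formulas for categorical tests such as the genotype test and the LTT take.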