Goodman Matthew O, Chibnik Lori, Cai Tianxi
Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts.
Genet Epidemiol. 2019 Feb;43(1):82-101. doi: 10.1002/gepi.22162. Epub 2018 Oct 24.
Biomedical studies commonly collect outcome measures that contain informative excess zeros. For example, when measuring the burden of neuritic plaques (NPs) in brain pathology studies, individuals with no plaques still contribute to our understanding of neurodegenerative disease. Such an outcome may be characterized by a mixture distribution in which one component is a "structural zero" and the other is a Poisson distribution. We propose a novel variance components score test of association between a set of genetic markers and a zero-inflated count outcome from this mixture distribution. The test shares advantageous properties with single-nucleotide polymorphism (SNP)-set tests previously devised for standard continuous or binary outcomes, such as the sequence kernel association test. In particular, our method has superior statistical power compared to competing methods, especially when the markers within the set are correlated and when the SNPs are associated with both the mixing proportion and the rate of the Poisson distribution. We apply the method to Alzheimer's disease data from the Rush University Religious Orders Study and Memory and Aging Project. As proof of principle, applying it to a zero-inflated NP count outcome, we find highly significant associations with the APOE gene in both the "structural zero" and "count" parameters.
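The mixture described in the abstract (a "structural zero" component plus a Poisson component) can be illustrated with a minimal sketch of a zero-inflated Poisson probability mass function and log-likelihood. This is not the authors' variance components score test. The function names `zip_pmf` and `zip_loglik` and the parameter names `pi` (mixing proportion of structural zeros) and `lam` (Poisson rate) are assumptions introduced here for illustration only.

```python
import math


def zip_pmf(y, pi, lam):
    """P(Y = y) for a zero-inflated Poisson outcome (illustrative sketch).

    With probability pi the outcome is a structural zero; otherwise it is
    drawn from Poisson(lam). Both names are hypothetical, not from the paper.
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    # A zero can arise from either component; positive counts only from the Poisson.
    structural = pi if y == 0 else 0.0
    return structural + (1.0 - pi) * poisson


def zip_loglik(counts, pi, lam):
    """Log-likelihood of independent counts under the same ZIP parameters."""
    return sum(math.log(zip_pmf(y, pi, lam)) for y in counts)
```

Under this parameterization the marginal mean is `(1 - pi) * lam`, so a marker set could in principle shift the outcome through the mixing proportion, the Poisson rate, or both, which is the situation the abstract highlights.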