Duke-NUS Medical School, Centre for Quantitative Medicine, National University of Singapore, Singapore, Singapore.
Duke Molecular Physiology Institute, Duke University School of Medicine, Durham, North Carolina, USA.
Genet Epidemiol. 2022 Feb;46(1):73-86. doi: 10.1002/gepi.22438. Epub 2021 Nov 15.
Count data with excess zeros are increasingly common in genetic association studies; one example is neuritic plaque counts in brain pathology for Alzheimer's disease. Here, we developed gene-based association tests that model such data as a mixture of two distributions: a Binomial distribution that contributes the structural zeros, and a Poisson distribution that generates the counts. We derived the score statistics for the parameters corresponding to the rare variants in the zero-inflated Poisson (ZIP) regression model, and then constructed burden (ZIP-b) and kernel (ZIP-k) tests. We also evaluated omnibus tests that combine the ZIP-b and ZIP-k tests. Using simulated sequence data, we illustrated the potential power gain of the proposed method over a two-stage method that analyzes binary and non-zero continuous data separately, for both burden and kernel tests. As expected, the ZIP burden test outperformed the kernel test in all scenarios except the one in which variants had genetic effects in mixed directions. We further demonstrated its application to the neuritic plaque data in the ROSMAP cohort. We expect the proposed test to be useful in practice, as either more powerful than or complementary to the two-stage method.
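As a minimal sketch of the two-part mixture described above (not the authors' implementation), the zero-inflated Poisson probability mass function can be written as follows. The names `zip_pmf`, `zip_mean`, `pi` (probability of a structural zero) and `lam` (Poisson mean) are illustrative assumptions. In a regression setting both parameters would typically depend on covariates and genotypes, which this sketch leaves out.

```python
import math


def zip_pmf(y, pi, lam):
    """P(Y = y) for a zero-inflated Poisson variable.

    pi  : assumed probability that an observation is a structural zero
    lam : assumed mean of the Poisson component
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        # A zero arises either structurally or from the Poisson part.
        return pi + (1.0 - pi) * poisson
    # Positive counts can only come from the Poisson part.
    return (1.0 - pi) * poisson


def zip_mean(pi, lam):
    """Expected value of the mixture: (1 - pi) * lam."""
    return (1.0 - pi) * lam
```

With `pi = 0` the mixture reduces to an ordinary Poisson distribution, which is a quick sanity check on the zero-inflation term.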