Department of Statistics, The Ohio State University, Columbus, OH, USA.
Battelle Center for Mathematical Medicine, Nationwide Children's Hospital, Columbus, OH, USA.
Stat Methods Med Res. 2020 Nov;29(11):3340-3350. doi: 10.1177/0962280220927728. Epub 2020 Jun 4.
Haplotype-based association methods have been developed to understand the genetic architecture of complex diseases. Compared with single-variant methods, haplotype methods are thought to be more biologically relevant, since complex diseases typically involve multiple non-independent genetic variants, and haplotypes implicitly account for the non-independence caused by linkage disequilibrium. In recent years, as the focus has moved from common to rare variants, haplotype-based methods have evolved accordingly to uncover the roles of rare haplotypes. One such approach is regularization-based, exemplified by the Bayesian least absolute shrinkage and selection operator (Lasso). Methods of this type have been developed for either case-control population data (the logistic Bayesian Lasso (LBL)) or family data (the family-triad-based logistic Bayesian Lasso (famLBL)). In some situations, both family data and case-control data are available, and it would be a waste of resources if only one of them could be analyzed. To make full use of the available data and increase power, we propose a unified approach that combines case-control and family data (the combined logistic Bayesian Lasso (cLBL)). Through simulations, we characterized the performance of cLBL and showed its advantage over existing methods. We further applied cLBL to the Framingham Heart Study data to demonstrate its utility in real data applications.