Chung Ren-Hua, Tsai Wei-Yun, Martin Eden R
Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Miaoli, Taiwan.
Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, Miami, Florida, United States of America.
PLoS One. 2014 Sep 22;9(9):e107800. doi: 10.1371/journal.pone.0107800. eCollection 2014.
Current family-based association tests for sequencing data were developed mainly to identify rare variants associated with a complex disease. Because a disease can be influenced by the joint effects of common and rare variants, methods that focus on rare variants may miss common variants with modest effects. Moreover, variants can have risk, neutral, or protective effects. Association tests that effectively select groups of common and rare variants likely to be causal, and that account for the direction of effect, have therefore become important. We developed the Ordered Subset - Variable Threshold - Pedigree Disequilibrium Test (OVPDT), a combination of three algorithms, for association analysis of family sequencing data. The ordered subset algorithm selects a subset of common variants based on their relative risks, which are calculated using only parental mating types. The variable threshold algorithm searches for an optimal allele frequency threshold such that rare variants below the threshold are more likely to be causal. The PDT statistics from the rare and common variants selected by the two algorithms are combined into the OVPDT statistic, and a permutation procedure is used to calculate the p-value. Using simulations, we demonstrated that OVPDT has correct type I error rates under different scenarios and compared its power with that of two other family-based association tests. The results suggest that OVPDT can have more power than the other tests when both common and rare variants in a region affect the disease.
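The variable threshold search and the permutation p-value described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' OVPDT implementation: it omits the ordered subset step for common variants, and the function names, the input layout (a families-by-variants array `d` of per-family allele-count differences, such as transmitted minus untransmitted minor alleles), the PDT-like statistic, the one-sided maximum, and the sign-flip permutation scheme are all assumptions made for illustration.

```python
import numpy as np


def vt_statistic(d, maf, thresholds):
    """Largest PDT-like statistic over candidate allele-frequency thresholds.

    d          : (n_families, n_variants) array of per-family differences
                 (hypothetical input layout, assumed for this sketch).
    maf        : (n_variants,) array of minor allele frequencies.
    thresholds : candidate frequency cut-offs to search over.
    """
    best = -np.inf
    for t in thresholds:
        keep = maf <= t                      # variants at or below the threshold
        if not keep.any():
            continue
        fam = d[:, keep].sum(axis=1)         # collapse kept variants per family
        denom = np.sqrt((fam ** 2).sum())
        if denom > 0:
            best = max(best, fam.sum() / denom)
    return best


def permutation_pvalue(d, maf, thresholds, n_perm=1000, seed=0):
    """Permutation p-value for the maximised statistic.

    Assumption: under the null hypothesis each family's differences are
    symmetric around zero, so a random sign flip per family gives one
    permuted data set. The threshold search is repeated on every permuted
    data set, which accounts for the multiple thresholds tested.
    """
    rng = np.random.default_rng(seed)
    observed = vt_statistic(d, maf, thresholds)
    hits = 0
    for _ in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=d.shape[0])
        permuted = vt_statistic(d * signs[:, None], maf, thresholds)
        if permuted >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `permutation_pvalue(d, maf, thresholds=[0.01, 0.05])` returns the maximised statistic and its permutation p-value. Because the threshold is re-optimised inside every permutation, the p-value already reflects the search over thresholds and needs no separate multiple-testing correction for it.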