Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN 38105.
Curr Genomics. 2013 Jun;14(4):250-5. doi: 10.2174/13892029113149990001.
Gene-based genetic association studies have recently gained conceptual popularity. Using the gene as the testing unit can provide biological insight into the etiology of a complex disease. Several gene-based methods, such as the minimum p-value (equivalently, maximum test statistic) method and the entropy-based method, have been developed and offer more power than single nucleotide polymorphism (SNP)-based analysis. The objective of this study is to compare the performance of the entropy-based method with that of the minimum p-value method and single SNP-based analysis, and to explore their strengths and weaknesses. Simulation studies show that: 1) all three methods reasonably control the false-positive rate; 2) the minimum p-value method outperforms the entropy-based and single SNP-based methods when the gene contains only one disease-related SNP; 3) the entropy-based method outperforms the other methods when the gene contains more than two disease-related SNPs; and 4) the entropy-based method is computationally more efficient than the minimum p-value method. In an application to a real data set, the entropy-based method identified more significant genes than the other two methods.
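To make the minimum p-value idea concrete, here is a minimal sketch of a gene-level test built from the maximum single-SNP statistic. The abstract does not say which single-SNP test the authors use or how the gene-level p-value is calibrated, so several things here are assumptions: a 1-df trend statistic (N times the squared correlation between genotype dosage and case/control status), permutation of phenotype labels for calibration, and the hypothetical function names `trend_statistic`, `max_statistic`, and `min_p_gene_test`. Because every SNP shares the same 1-df reference distribution, the largest statistic corresponds to the smallest p-value.

```python
import random


def trend_statistic(genotype_column, phenotype):
    """Assumed single-SNP test: N * r^2 between dosage (0/1/2) and status (0/1)."""
    n = len(phenotype)
    mean_g = sum(genotype_column) / n
    mean_y = sum(phenotype) / n
    sxy = sum((g - mean_g) * (y - mean_y)
              for g, y in zip(genotype_column, phenotype))
    sxx = sum((g - mean_g) ** 2 for g in genotype_column)
    syy = sum((y - mean_y) ** 2 for y in phenotype)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic SNP or single-class phenotype: no information
    return n * sxy * sxy / (sxx * syy)


def max_statistic(genotypes, phenotype):
    """Largest single-SNP statistic over all SNPs in the gene.

    genotypes is a list of individuals, each a list of per-SNP dosages.
    """
    return max(trend_statistic(list(column), phenotype)
               for column in zip(*genotypes))


def min_p_gene_test(genotypes, phenotype, n_permutations=1000, seed=0):
    """Permutation p-value for the gene, using the maximum statistic.

    Phenotype labels are shuffled while genotype rows stay intact, so the
    correlation (linkage disequilibrium) among SNPs is preserved under the null.
    """
    rng = random.Random(seed)
    observed = max_statistic(genotypes, phenotype)
    shuffled = list(phenotype)
    exceed = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if max_statistic(genotypes, shuffled) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (exceed + 1) / (n_permutations + 1)
```

Under this permutation-based calibration, every gene requires recomputing all single-SNP statistics once per permutation. That would be one plausible source of the computational cost the abstract attributes to the minimum p-value method, though the authors' actual procedure may differ.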