Department of Statistics and Probability, Michigan State University, East Lansing, Michigan 48824.
Curr Genomics. 2012 Nov;13(7):566-73. doi: 10.2174/138920212803251382.
The availability of high-density single nucleotide polymorphism (SNP) data has made it possible for human genetic association studies to identify common and rare variants underlying complex diseases on a genome-wide scale. A handful of novel genetic variants have been identified, which gives much hope for the future of genetic association studies. In this process, statistical and computational methods play key roles, and among them information-based association tests have gained wide popularity. This paper gives a comprehensive review of the current literature on genetic association analysis cast in the framework of information theory. We focus our review on the following topics: (1) information-theoretic approaches in genetic linkage and association studies; (2) entropy-based strategies for optimal SNP subset selection; and (3) the use of information-theoretic criteria in gene clustering and gene regulatory network construction.