School of Computer Science & Engineering, Xidian University, Xi'an 710071, China.
IEEE Trans Nanobioscience. 2010 Dec;9(4):232-41. doi: 10.1109/TNB.2010.2070805. Epub 2010 Sep 13.
One of the main challenges in studying common complex human diseases is finding both strong and weak susceptibility single-nucleotide polymorphisms (SNPs) and identifying the form of the underlying genetic disease model. A number of methods have been proposed for this purpose, but many have not been validated on a variety of genome datasets, so their performance in practice remains unclear. In this paper, we present ProbSNP, a novel SNP association study method based on probability theory. The method first detects SNPs by evaluating their joint probabilities in combination with disease status and selects those with the lowest joint probabilities as susceptibility SNPs; it then identifies some forms of genetic disease models by testing multiple-locus interactions among the selected SNPs. The joint probabilities of combined SNPs are estimated from Gaussian probability density functions whose parameters (i.e., mean and standard deviation) are derived from allele and haplotype frequencies. Finally, we test and validate the method on various genome datasets and find that ProbSNP performs well on both simulated genome data and real genome-wide data.
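The first stage described above (scoring SNPs with a Gaussian density and keeping the lowest-scoring ones) can be illustrated with a minimal sketch. This is not the ProbSNP implementation: the abstract does not give the exact estimators, so the sketch assumes a single-SNP score in which the mean and standard deviation come from a normal approximation to the binomial count of minor alleles, with the allele frequency estimated from controls. The function names `gaussian_pdf` and `rank_snps`, the 0/1/2 genotype coding, and the toy data are all hypothetical. Haplotype frequencies and the multiple-locus interaction tests are not covered.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of the normal distribution N(mean, std^2) evaluated at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def rank_snps(genotypes, status, top_k=2):
    """Return the indices of the top_k SNPs with the lowest Gaussian density.

    genotypes: one list per sample of minor-allele counts (0, 1, or 2) per SNP
    status:    one entry per sample, 0 = control, 1 = case
    """
    cases = [g for g, s in zip(genotypes, status) if s == 1]
    controls = [g for g, s in zip(genotypes, status) if s == 0]
    n_case_alleles = 2 * len(cases)
    scores = []
    for j in range(len(genotypes[0])):
        # Assumption: the reference allele frequency is estimated from controls.
        p = sum(g[j] for g in controls) / (2 * len(controls))
        p = min(max(p, 1e-6), 1 - 1e-6)  # keep the standard deviation positive
        # Assumption: normal approximation to a binomial allele count in cases.
        mean = n_case_alleles * p
        std = math.sqrt(n_case_alleles * p * (1 - p))
        observed = sum(g[j] for g in cases)
        scores.append((gaussian_pdf(observed, mean, std), j))
    scores.sort()  # lowest density first: least compatible with control frequency
    return [j for _, j in scores[:top_k]]


# Toy data (hypothetical): 4 controls followed by 4 cases, 3 SNPs each.
toy_genotypes = [
    [0, 1, 1], [1, 1, 0], [0, 0, 1], [1, 0, 2],  # controls
    [2, 1, 1], [2, 0, 1], [2, 1, 1], [2, 0, 2],  # cases
]
toy_status = [0, 0, 0, 0, 1, 1, 1, 1]
print(rank_snps(toy_genotypes, toy_status, top_k=3))  # prints [0, 2, 1]
```

In the toy data, SNP 0 is strongly enriched in cases, so its observed allele count lies far from the mean implied by the control frequency and it receives the lowest density. A real analysis would need the paper's actual parameter estimates and joint treatment of combined SNPs.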