Han Bing, Chen Xue-wen, Talebizadeh Zohreh, Xu Hua
Bioinformatics and Computational Life-Sciences Laboratory, ITTC, Department of Electrical Engineering and Computer Science, University of Kansas, 1520 West 15th Street, Lawrence, KS 66045, USA.
BMC Syst Biol. 2012;6 Suppl 3(Suppl 3):S14. doi: 10.1186/1752-0509-6-S3-S14. Epub 2012 Dec 17.
Detecting epistatic interactions plays a significant role in improving our understanding of the pathogenesis of complex human diseases, and in improving their prevention, diagnosis, and treatment. Applying machine learning or statistical methods to epistatic interaction detection runs into several common problems: a very limited number of samples, an extremely large search space, a large number of false positives, and the difficulty of choosing a measure of association between disease markers and the phenotype.
To address these problems, we propose EpiBN, a score-based Bayesian network structure learning method for detecting epistatic interactions. We apply the proposed method to both simulated datasets and three real disease datasets. Experimental results on simulated data show that our method outperforms several other commonly used methods in terms of power and sample efficiency, and is especially suitable for detecting epistatic interactions with weak or no marginal effects. Furthermore, our method scales to real disease data.
We propose a Bayesian network-based method, EpiBN, to detect epistatic interactions. In EpiBN, we develop a new scoring function that can reflect higher-order epistatic interactions by estimating the model complexity from data, and we apply a fast Branch-and-Bound algorithm to learn the structure of a two-layer Bayesian network containing only one target node. To make our method scalable to real data, we propose using a Markov chain Monte Carlo (MCMC) method to perform the screening process. Applying the proposed method to real genome-wide association study (GWAS) datasets may provide helpful insights into the genetic basis of age-related macular degeneration, late-onset Alzheimer's disease, and autism.
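The idea of scoring candidate parent sets for a single target node can be illustrated with a minimal sketch. It is not the EpiBN implementation: the abstract does not give the scoring function, so the standard BIC score is swapped in, and an exhaustive search over small SNP subsets replaces both the Branch-and-Bound search and the MCMC screening step. All names (`bic_score`, `best_parent_set`, `simulate`) and all simulation parameters are hypothetical. The simulated phenotype depends only on the parity of two SNPs, so each SNP alone has no marginal effect.

```python
import itertools
import math
import random
from collections import Counter


def bic_score(parents, genotypes, phenotype):
    """BIC of a binary phenotype node given a candidate parent set.

    NOTE: BIC is an assumption used for illustration; it is not the
    scoring function proposed in the paper.
    """
    n = len(phenotype)
    config_counts = Counter()  # samples per parent-genotype configuration
    joint_counts = Counter()   # samples per (configuration, phenotype) pair
    for row, y in zip(genotypes, phenotype):
        cfg = tuple(row[j] for j in parents)
        config_counts[cfg] += 1
        joint_counts[(cfg, y)] += 1
    # Maximum-likelihood log-likelihood of the phenotype given its parents.
    loglik = sum(
        c * math.log(c / config_counts[cfg])
        for (cfg, _), c in joint_counts.items()
    )
    # One free parameter per parent configuration (3 genotypes per SNP).
    n_params = 3 ** len(parents)
    return loglik - 0.5 * n_params * math.log(n)


def best_parent_set(genotypes, phenotype, max_order=2):
    """Exhaustively score every SNP subset up to max_order (hypothetical helper)."""
    n_snps = len(genotypes[0])
    candidates = [()]
    for k in range(1, max_order + 1):
        candidates.extend(itertools.combinations(range(n_snps), k))
    return max(candidates, key=lambda ps: bic_score(ps, genotypes, phenotype))


def simulate(n_samples=2000, n_snps=8, seed=0):
    """Simulate genotypes (0/1/2) and a phenotype driven by SNPs 0 and 1 only.

    Each genotype is the sum of two fair allele draws, so a heterozygote
    has probability 0.5 and the parity model below has no marginal effect.
    """
    rng = random.Random(seed)
    genotypes = [
        [rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n_snps)]
        for _ in range(n_samples)
    ]
    phenotype = []
    for row in genotypes:
        risk = 0.8 if (row[0] + row[1]) % 2 == 1 else 0.2
        phenotype.append(1 if rng.random() < risk else 0)
    return genotypes, phenotype


if __name__ == "__main__":
    G, y = simulate()
    print("best parent set:", best_parent_set(G, y))
    print("score of SNP 0 alone:", bic_score((0,), G, y))
    print("score of SNPs 0 and 1:", bic_score((0, 1), G, y))
```

In this sketch the pair of interacting SNPs scores far above either SNP alone, which is the situation a single-marker test would miss. Exhaustive enumeration is only feasible for a handful of SNPs, which is why a pruned search and a screening step matter at GWAS scale.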