School of Software, East China Jiaotong University, Nanchang, Jiangxi, 330013, China.
School of Electrical and Automation Engineering, East China Jiaotong University, Nanchang, Jiangxi, 330013, China.
Sci Rep. 2018 Apr 18;8(1):6155. doi: 10.1038/s41598-018-24588-5.
Understanding the genetic mechanisms of complex diseases is a serious challenge. Existing methods often neglect the heterogeneity of complex diseases, which results in a lack of power or low reproducibility. Addressing heterogeneity when detecting epistatic single nucleotide polymorphisms (SNPs) can enhance the power of association studies and improve prediction performance in the diagnosis of complex diseases. In this study, we propose a three-stage deep learning framework, consisting of epistasis detection, clustering, and prediction, that addresses both the epistasis and the heterogeneity of complex diseases. The epistasis detection stage applies a multi-objective optimization method to find several candidate sets of epistatic SNPs that contribute to different subtypes of a complex disease. A K-means clustering algorithm is then used to define subtypes within the case group. Finally, a deep learning model is trained on a graphics processing unit (GPU) for disease prediction. Experimental results on pure and heterogeneous datasets suggest that our method is potentially practical and may serve as an alternative to existing methods. It is especially suitable for the diagnosis of complex diseases when epistasis and heterogeneity are both present.
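The clustering stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the genotype matrix `cases` is hypothetical toy data (rows are case samples, columns are candidate SNPs coded as 0, 1, or 2 copies of the minor allele), and the farthest-first initialization is an assumption chosen only to keep the example deterministic. The loop itself is plain Lloyd's K-means.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two genotype vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_first(points, k):
    """Deterministic initialization (an assumption of this sketch):
    start at the first point, then repeatedly add the point that is
    farthest from its nearest already-chosen centroid."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's K-means; returns (labels, centroids)."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each sample goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments unchanged, so the centroids are stable
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical case group: the first three samples carry minor alleles at
# SNPs 0 and 1, the last three at SNPs 2 and 3 (two simulated subtypes).
cases = [
    [2, 2, 0, 0],
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 2, 2],
    [0, 0, 1, 2],
    [0, 0, 2, 1],
]

labels, centroids = kmeans(cases, k=2)
print(labels)  # -> [0, 0, 0, 1, 1, 1]
```

In this toy setting the two recovered clusters correspond to the two simulated subtypes. In the framework described above, the columns would instead be the candidate epistatic SNP sets returned by the detection stage, and the resulting subtype labels would feed the prediction stage.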