Department of Information Engineering, Xijing University, Xi'an, 710123, China.
School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China.
J Transl Med. 2017 Oct 16;15(1):209. doi: 10.1186/s12967-017-1304-7.
A growing body of clinical research shows that abnormal levels of specific microbes are closely associated with the development of various human diseases. Knowledge of microbe-disease associations can provide valuable insight into the mechanisms of complex diseases and support their prevention, diagnosis, and treatment. However, little effort has been made to predict candidate microbes for complex human diseases on a large scale.
In this work, we developed a new computational model, NGRHMDA, for predicting microbe-disease associations by combining two individual recommendation methods. Based on the assumption that functionally similar microbes tend to be involved in the mechanisms of similar diseases, we adopted neighbor-based collaborative filtering and a graph-based scoring method to compute the association likelihood of each microbe-disease pair. The promising prediction performance can be attributed to this hybrid approach and to the introduction of Gaussian kernel-based similarity and symptom-based disease similarity.
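The two ingredients named above, Gaussian kernel-based similarity and neighbor-based collaborative filtering, can be illustrated with a minimal sketch. This is not the NGRHMDA implementation. The abstract does not give the kernel formula or the neighborhood size, so the code assumes the commonly used Gaussian interaction profile kernel (bandwidth normalized by the mean squared profile norm) and a hypothetical parameter `k`. The function names are also illustrative.

```python
import numpy as np

def gaussian_kernel_similarity(profiles):
    """Gaussian interaction profile kernel similarity between rows of `profiles`.

    Assumption: the bandwidth is 1 / (mean squared row norm), a common choice;
    the abstract does not state the exact form used by the authors.
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = (profiles ** 2).sum(axis=1)
    mean_sq = sq_norms.mean()
    gamma = 1.0 / mean_sq if mean_sq > 0 else 1.0
    # Pairwise squared Euclidean distances via the expansion |a-b|^2 = |a|^2 + |b|^2 - 2ab.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

def neighbor_scores(assoc, microbe_sim, k=2):
    """Score every (disease, microbe) pair from the k most similar microbes.

    `assoc` is a diseases x microbes 0/1 matrix; `k` is a hypothetical
    neighborhood size chosen only for illustration.
    """
    assoc = np.asarray(assoc, dtype=float)
    sim = np.array(microbe_sim, dtype=float)
    np.fill_diagonal(sim, 0.0)  # a microbe is not its own neighbor
    scores = np.zeros_like(assoc)
    for m in range(assoc.shape[1]):
        neighbors = np.argsort(sim[m])[::-1][:k]  # indices of the k largest similarities
        weights = sim[m, neighbors]
        total = weights.sum()
        if total > 0:
            # Similarity-weighted average of the neighbors' association profiles.
            scores[:, m] = assoc[:, neighbors] @ weights / total
    return scores
```

A typical call would build the microbe similarity from the transposed association matrix, for example `neighbor_scores(A, gaussian_kernel_similarity(A.T))`, so that microbes associated with the same diseases count as similar.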
To evaluate the proposed model, we performed leave-one-out cross validation (LOOCV) and fivefold cross validation (CV) on HMDAD, a recently built database that is the first to collect experimentally confirmed microbe-disease associations. NGRHMDA achieved reliable results, with AUCs of 0.9023 ± 0.0031 under fivefold CV and 0.9111 under LOOCV. In addition, 78.2% of microbe samples and 66.7% of disease samples were consistent with the basic assumption of our work that microbes tend to be involved in similar disease clusters, and vice versa.
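The leave-one-out procedure can be sketched as follows. This is a simplified illustration, not the authors' evaluation code. Each known association is hidden in turn, the model is rescored, and the held-out pair is ranked against all pairs with no known association. The reported value here is the mean fraction of candidates outranked, with ties counting half, which is an assumption about how the ranks are pooled. `score_fn` is a hypothetical callable that maps a training association matrix to a score matrix of the same shape.

```python
import numpy as np

def loocv_auc(assoc, score_fn):
    """Leave-one-out AUC estimate for an association-scoring function.

    `assoc` is a diseases x microbes 0/1 matrix. `score_fn(train)` must return
    a score matrix with the same shape as `train` (hypothetical interface).
    """
    assoc = np.asarray(assoc, dtype=float)
    unknown = assoc == 0                 # candidate pairs with no known association
    positives = np.argwhere(assoc == 1)  # known pairs, each held out once
    if not unknown.any() or len(positives) == 0:
        raise ValueError("need at least one known and one unknown pair")
    aucs = []
    for d, m in positives:
        train = assoc.copy()
        train[d, m] = 0.0                # hide the known association
        scores = np.asarray(score_fn(train), dtype=float)
        held = scores[d, m]
        cand = scores[unknown]
        # Fraction of unknown candidates ranked below the held-out pair.
        wins = np.sum(held > cand) + 0.5 * np.sum(held == cand)
        aucs.append(wins / cand.size)
    return float(np.mean(aucs))
```

Fivefold CV follows the same pattern, hiding one fifth of the known associations per fold and repeating over random partitions, which is what yields a mean and a standard deviation.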
Compared with other methods, the prediction results of NGRHMDA demonstrate its effectiveness in predicting microbe-disease associations. We anticipate that NGRHMDA can serve as a useful tool for identifying the most promising candidate microbes for various diseases, and thereby advance medical knowledge and drug development. The code and dataset for this work can be downloaded from https://github.com/yahuang1991/NGRHMDA .