Department of Information Engineering, Xijing University, Xi'an, 710123, China.
Center for Computer Science and Information Technology, City University of Hong Kong Dongguan Research Institute, Dongguan, China.
BMC Genomics. 2022 Mar 16;22(Suppl 1):916. doi: 10.1186/s12864-022-08423-w.
Recent evidence suggests that human microorganisms participate in important biological activities in the human body. Dysfunction of host-microbiota interactions can lead to complex human disorders. Knowledge of host-microbiota interactions can provide valuable insights into the pathological mechanisms of diseases. However, it is time-consuming and costly to identify disorder-specific microbes from the biological "haystack" by routine wet-lab experiments alone. With the development of next-generation sequencing and omics-based trials, it is imperative to develop computational models for predicting microbe-disease associations on a large scale.
Using the known microbe-disease associations in the Human Microbe-Disease Association Database (HMDAD), the proposed model achieves an area under the ROC curve (AUC) of 0.9456 in leave-one-out cross-validation and 0.8866 in five-fold cross-validation. In a case study of colorectal carcinoma, 80% of the top-20 predicted microbes have been experimentally confirmed in the published literature.
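For readers unfamiliar with the metric, the AUC reported above can be read as the probability that a randomly chosen known association is scored higher than a randomly chosen unconfirmed pair. The following is a minimal sketch of that rank-based definition; the function name, scores, and labels are hypothetical illustrations and are not taken from the paper's data or code.

```python
def auc_from_scores(scores, labels):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive receives the higher score; ties count as half a win."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical prediction scores for four microbe-disease pairs;
# label 1 marks a held-out known association, 0 an unconfirmed pair.
example_auc = auc_from_scores([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
print(example_auc)
```

In leave-one-out cross-validation each known association would be held out in turn and scored against the unconfirmed candidates; how the paper pools those rankings is not stated in the abstract, so that step is left out here.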
Based on the assumption that functionally similar microbes tend to share similar interaction patterns with human diseases, we propose a group-based computational model of Bayesian disease-oriented ranking to prioritize the microbes most likely to be associated with various human diseases. Using gene sequence information, two computational approaches (BLAST+ and MEGA 7) are leveraged to measure microbe-microbe similarity from different perspectives. Disease-disease similarity is calculated by capturing the hierarchy information in the Medical Subject Headings (MeSH) data. The experimental results illustrate the accuracy and effectiveness of the proposed model. This work is expected to facilitate the characterization and identification of promising microbial biomarkers.
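The abstract does not give the formula used for the MeSH-based disease similarity. As an illustration only, the sketch below assumes a commonly used hierarchy-based scheme: each disease contributes 1.0 to itself, each ancestor receives a contribution that decays by a fixed factor per level, and two diseases are compared by the contributions of their shared ancestors. The decay factor of 0.5, the function names, and the toy hierarchy are all assumptions and do not come from the paper or from the real MeSH tree.

```python
def semantic_contributions(term, parents, decay=0.5):
    """Contribution of `term` and each of its ancestors to the meaning of `term`.
    The term itself gets 1.0; an ancestor gets the best decayed value over all paths."""
    contributions = {term: 1.0}
    frontier = [term]
    while frontier:
        node = frontier.pop()
        for parent in parents.get(node, ()):
            value = decay * contributions[node]
            if value > contributions.get(parent, 0.0):
                contributions[parent] = value
                frontier.append(parent)  # revisit so the better value propagates upward
    return contributions


def disease_similarity(a, b, parents, decay=0.5):
    """Shared-ancestor contributions divided by the total contributions of both terms."""
    contrib_a = semantic_contributions(a, parents, decay)
    contrib_b = semantic_contributions(b, parents, decay)
    shared = set(contrib_a) & set(contrib_b)
    numerator = sum(contrib_a[t] + contrib_b[t] for t in shared)
    return numerator / (sum(contrib_a.values()) + sum(contrib_b.values()))


# Toy child -> parents hierarchy (hypothetical, not the real MeSH tree).
toy_parents = {
    "colorectal": ["gut"],
    "gut": ["root"],
    "liver": ["root"],
    "root": [],
}

print(disease_similarity("colorectal", "liver", toy_parents))
print(disease_similarity("colorectal", "colorectal", toy_parents))
```

Under this scheme a disease is maximally similar to itself, and two diseases that share only a distant ancestor receive a low score. The sequence-based microbe-microbe similarities from BLAST+ and MEGA 7 would be computed separately and are not sketched here.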