Liu Lizhi, Zhu Shanfeng
School of Computer Science, Fudan University, Shanghai, 200433 China.
Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433 China.
Phenomics. 2021 Aug 6;1(4):171-185. doi: 10.1007/s43657-021-00019-w. eCollection 2021 Aug.
Deciphering the relationship between human proteins (genes) and phenotypes is one of the fundamental tasks in phenomics research. The Human Phenotype Ontology (HPO) provides a standardized, logically structured vocabulary for describing the abnormal phenotypes encountered in human diseases, and paves the way towards the computational analysis of their genetic causes. To date, many computational methods have been proposed to predict the HPO annotations of proteins. In this paper, we comprehensively review existing approaches to three tasks: predicting HPO annotations of novel proteins, identifying missing HPO annotations, and prioritizing candidate proteins with respect to a given HPO term. For each topic, we first give a formal description of the problem, then systematically revisit the published literature, highlighting the advantages and disadvantages of each approach, and finally discuss the challenges and promising future directions. In addition, we point out several topics worthy of further exploration, including the selection of negative HPO annotations and the detection of HPO misannotations. We believe that this review will help researchers in computational phenotype analysis to understand existing prediction algorithms and to develop novel ones.