Xue Hansheng, Peng Jiajie, Shang Xuequn
School of Computer Science, Northwestern Polytechnical University, Xi'an, China.
School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, China.
BMC Syst Biol. 2019 Apr 5;13(Suppl 2):34. doi: 10.1186/s12918-019-0697-8.
Improving the efficiency of disease diagnosis based on phenotype ontology is a critical yet challenging research area. Recently, Human Phenotype Ontology (HPO)-based semantic similarity has been effectively and widely used to identify causative genes and diseases. However, current phenotype similarity measurements consider only the annotations and hierarchical structure of HPO, neglecting the textual definitions of phenotype terms.
In this paper, we propose a novel phenotype similarity measurement, termed DisPheno, which incorporates the definitions of phenotype terms in addition to HPO structure and annotations to measure the similarity between phenotype terms. DisPheno also integrates phenotype term associations into phenotype-set similarity measurement using the gene and disease annotations of phenotype terms.
Compared with five existing state-of-the-art methods, DisPheno performs well in HPO-based phenotype semantic similarity measurement and improves the efficiency of disease identification, especially on noisy patient datasets.
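The idea of combining ontology structure and annotations with term definitions can be illustrated with a minimal sketch. This is not the DisPheno formula, which the abstract does not give. It assumes a toy ontology with hypothetical term names, hypothetical annotation counts, and made-up definitions. As stand-ins, it uses Lin's information-content similarity for the structure and annotation part, bag-of-words cosine similarity for the definition part, and a linear weight `alpha` to mix them. All of these choices are assumptions made for illustration.

```python
import math
from collections import Counter

# Toy ontology (hypothetical terms, not real HPO identifiers): child -> parents.
PARENTS = {"root": [], "A": ["root"], "B": ["root"], "A1": ["A"], "A2": ["A"]}

# Hypothetical number of genes/diseases annotated directly to each term.
DIRECT_ANNOTATIONS = {"root": 0, "A": 1, "B": 4, "A1": 2, "A2": 1}

# Made-up textual definitions for the terms being compared.
DEFINITIONS = {
    "A1": "abnormal shape of the left hand",
    "A2": "abnormal shape of the right hand",
    "B": "elevated blood glucose concentration",
}


def ancestors(term):
    """Return the term itself plus all of its ancestors."""
    seen, stack = {term}, [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(term):
    """IC = log(total / count), where count includes descendant annotations."""
    total = sum(DIRECT_ANNOTATIONS.values())
    count = sum(n for t, n in DIRECT_ANNOTATIONS.items() if term in ancestors(t))
    return math.log(total / count)


def structure_similarity(a, b):
    """Lin similarity, using the most informative common ancestor."""
    common = ancestors(a) & ancestors(b)
    best_ic = max(information_content(t) for t in common)
    denom = information_content(a) + information_content(b)
    return 2 * best_ic / denom if denom > 0 else 0.0


def definition_similarity(a, b):
    """Cosine similarity between bag-of-words vectors of the definitions."""
    va, vb = Counter(DEFINITIONS[a].split()), Counter(DEFINITIONS[b].split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm > 0 else 0.0


def combined_similarity(a, b, alpha=0.5):
    """Assumed linear mix of the structure-based and definition-based scores."""
    return alpha * structure_similarity(a, b) + (1 - alpha) * definition_similarity(a, b)


if __name__ == "__main__":
    print(combined_similarity("A1", "A2"))
    print(combined_similarity("A1", "B"))
```

In this toy setup the sibling terms `A1` and `A2` share an informative ancestor and most of their definition words, so they score higher than the unrelated pair `A1` and `B`. A real implementation would load the HPO graph, annotation files, and definitions, and would likely use a stronger text representation than raw word counts.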