College of Medical and Dental Sciences, Institute of Cancer and Genomic Sciences, University of Birmingham, UK; Institute of Translational Medicine, University Hospitals Birmingham, NHS Foundation Trust, UK; University Hospitals Birmingham NHS Foundation Trust, Edgbaston, Birmingham, UK.
Comput Biol Med. 2021 Jun;133:104360. doi: 10.1016/j.compbiomed.2021.104360. Epub 2021 Apr 1.
Ontology-based phenotype profiles have been used for differential diagnosis of rare genetic diseases and for decision support in specific disease domains. In particular, semantic similarity facilitates diagnostic hypothesis generation through comparison with disease phenotype profiles. However, the approach has not been applied to the differential diagnosis of common diseases, or to generalised clinical diagnostics using uncurated text-derived phenotypes. In this work, we describe the development of an approach for deriving patient phenotype profiles from clinical narrative text, and apply it to text associated with MIMIC-III patient visits. We then explore the use of semantic similarity with those text-derived phenotypes to classify primary patient diagnosis, comparing patient-patient similarity with patient-disease similarity, the latter using phenotype-disease profiles previously mined from literature. We also consider a combined approach, in which literature-derived phenotypes are extended with the content of text-derived phenotypes we mined from 500 patients. The results show that, in one setting, uncurated text-derived phenotypes can be used for differential diagnosis of common diseases, making use of information from both inside and outside that setting. Although the methods warrant further optimisation, they could be applied to a variety of clinical tasks, such as differential diagnosis, cohort discovery, document and text classification, and outcome prediction.
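To make the patient-disease comparison concrete, the following is a minimal sketch of ranking candidate diagnoses by semantic similarity. It rests on several assumptions not stated in the abstract: the ontology, term names, and disease profiles are hypothetical toy data; Resnik similarity with a best-match average is used as one common choice of measure, not necessarily the one used in this work; and information content is estimated from the toy disease profiles themselves.

```python
import math

# Hypothetical toy ontology (child -> parents); not a real phenotype ontology.
PARENTS = {
    "abnormality": [],
    "cardiac": ["abnormality"],
    "respiratory": ["abnormality"],
    "arrhythmia": ["cardiac"],
    "chest_pain": ["cardiac"],
    "dyspnea": ["respiratory"],
    "cough": ["respiratory"],
}

# Hypothetical disease phenotype profiles (stand-ins for literature-derived ones).
DISEASE_PROFILES = {
    "heart_failure": {"dyspnea", "arrhythmia"},
    "pneumonia": {"cough", "dyspnea"},
    "myocardial_infarction": {"chest_pain", "arrhythmia"},
}


def ancestors(term):
    """Return the term together with all of its ancestors."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(profiles):
    """IC(t) = log(N / n_t), where n_t counts profiles annotated with t
    or any descendant of t. Terms never annotated are left out."""
    n = len(profiles)
    counts = {}
    for terms in profiles.values():
        closure = set()
        for term in terms:
            closure |= ancestors(term)
        for term in closure:
            counts[term] = counts.get(term, 0) + 1
    return {term: math.log(n / c) for term, c in counts.items()}


def resnik(t1, t2, ic):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(t1) & ancestors(t2)
    return max((ic.get(a, 0.0) for a in common), default=0.0)


def best_match_average(p, q, ic):
    """Symmetric best-match average between two non-empty term sets."""
    def directed(a, b):
        return sum(max(resnik(x, y, ic) for y in b) for x in a) / len(a)
    return 0.5 * (directed(p, q) + directed(q, p))


def rank_diseases(patient_terms, profiles, ic):
    """Return disease names sorted from most to least similar."""
    scores = {
        disease: best_match_average(patient_terms, terms, ic)
        for disease, terms in profiles.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    ic = information_content(DISEASE_PROFILES)
    # Hypothetical text-derived patient profile.
    patient = {"cough", "dyspnea"}
    print(rank_diseases(patient, DISEASE_PROFILES, ic))
```

Under these toy assumptions, a patient profile of cough and dyspnea ranks pneumonia first. A patient-patient variant would compare the profile against other patients' profiles instead of disease profiles, and the combined approach described above would correspond to extending each disease's term set before scoring.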