Data Science Platform, Imagine Institute, Université de Paris Cité, Inserm UMR 1163, Paris, France.
Inserm, Centre de Recherche des Cordeliers, Sorbonne Université, Université de Paris Cité, Paris, France.
Stud Health Technol Inform. 2023 May 18;302:1037-1041. doi: 10.3233/SHTI230342.
In medical concept extraction, it is critical to determine whether clinical signs or symptoms mentioned in the text were present or absent, and whether they were experienced by the patient or by their relatives. Previous studies have focused on the NLP task itself but not on how to leverage this supplemental information in clinical applications. In this paper, we use the patient similarity networks framework to aggregate different phenotyping modalities. NLP techniques were applied to extract phenotypes and predict their modalities from 5470 narrative reports of 148 patients with ciliopathies (a group of rare diseases). Patient similarities were computed for each modality separately, then aggregated and used for clustering. We found that aggregating negated phenotypes improved patient similarity, but further aggregating relatives' phenotypes worsened the result. We suggest that different phenotype modalities can contribute to patient similarity, but they should be aggregated carefully, with appropriate similarity metrics and aggregation models.
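The per-modality similarity and aggregation step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: Jaccard similarity as the metric, a weighted mean as the aggregation model, the function names `jaccard`, `similarity_matrix` and `aggregate`, and the toy patients and phenotype terms.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; defined here as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(phenotypes):
    """Pairwise patient similarity for ONE modality.

    `phenotypes` maps a patient id to the set of phenotype terms
    extracted for that modality (e.g. present, negated, relatives').
    """
    patients = sorted(phenotypes)
    sim = {(p, p): 1.0 for p in patients}
    for p, q in combinations(patients, 2):
        sim[(p, q)] = sim[(q, p)] = jaccard(phenotypes[p], phenotypes[q])
    return sim


def aggregate(matrices, weights=None):
    """Weighted mean of per-modality matrices (an assumed aggregation model)."""
    weights = weights or [1.0] * len(matrices)
    total = sum(weights)
    return {
        pair: sum(w * m[pair] for w, m in zip(weights, matrices)) / total
        for pair in matrices[0]
    }


# Hypothetical toy data: one dict per modality, same patient ids in each.
present = {
    "p1": {"retinal dystrophy", "polydactyly"},
    "p2": {"retinal dystrophy", "obesity"},
    "p3": {"nephronophthisis"},
}
negated = {
    "p1": {"obesity"},
    "p2": {"polydactyly"},
    "p3": {"obesity"},
}

fused = aggregate([similarity_matrix(present), similarity_matrix(negated)])
```

In this toy data, p1 and p3 share no present phenotype but share one negated phenotype, so the aggregated matrix places them closer together than the present-only matrix does. The `weights` argument is one way to down-weight a modality that turns out to degrade the result.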