Finding patients using similarity measures in a rare diseases-oriented clinical data warehouse: Dr. Warehouse and the needle in the needle stack.

Authors

Garcelon Nicolas, Neuraz Antoine, Benoit Vincent, Salomon Rémi, Kracker Sven, Suarez Felipe, Bahi-Buisson Nadia, Hadj-Rabia Smail, Fischer Alain, Munnich Arnold, Burgun Anita

Affiliations

Institut Imagine, Paris Descartes Université Paris Descartes-Sorbonne Paris Cité, Paris, France; INSERM, Institut Imagine, UMR 1163, Université Paris Descartes, Sorbonne Paris Cité, Paris, France; INSERM, Centre de Recherche des Cordeliers, UMR 1138 Equipe 22, Université Paris Descartes, Sorbonne Paris Cité, Paris, France.

INSERM, Centre de Recherche des Cordeliers, UMR 1138 Equipe 22, Université Paris Descartes, Sorbonne Paris Cité, Paris, France; Département d'informatique médicale, Hôpital Necker-Enfants Malades, Assistance Publique-Hôpitaux de Paris (AP-HP), Université Paris Descartes, Sorbonne Paris Cité, France.

Publication information

J Biomed Inform. 2017 Sep;73:51-61. doi: 10.1016/j.jbi.2017.07.016. Epub 2017 Jul 25.

DOI: 10.1016/j.jbi.2017.07.016
PMID: 28754522

Abstract

OBJECTIVE

In the context of rare diseases, it may be helpful to detect patients with similar medical histories, diagnoses and outcomes from a large number of cases with automated methods. To reduce the time to find new cases, we developed a method to find similar patients given an index case leveraging data from the electronic health records.

MATERIALS AND METHODS

We used the clinical data warehouse of a children's academic hospital in Paris, France (Necker-Enfants Malades), containing about 400,000 patients. Our model was based on a vector space model (VSM) to compute the similarity distance between an index patient and all the patients of the data warehouse. The dimensions of the VSM were built upon Unified Medical Language System concepts extracted from clinical narratives stored in the clinical data warehouse. The VSM was enhanced using three parameters: a pertinence score (TF-IDF of the concepts), the polarity of the concept (negated/not negated) and the minimum number of concepts in common. We evaluated this model by displaying the most similar patients for five different rare diseases from the clinical data warehouse: Lowe Syndrome (LOWE), Dystrophic Epidermolysis Bullosa (DEB), Activated PI3K delta Syndrome (APDS), Rett Syndrome (RETT) and Dowling-Meara (EBS-DM), representing 18, 103, 21, 84 and 7 patients respectively.
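The following is a minimal sketch of how such a vector space model could be assembled, not the authors' implementation. It assumes that each patient is reduced to a list of (concept, negated) pairs, that polarity is handled by treating a negated and an affirmed mention as separate dimensions, that similarity is cosine similarity, and that the minimum number of common concepts acts as a filter before ranking. The function names, the `min_common` and `top_k` parameters, and the TF-IDF formula used here are all illustrative assumptions; the abstract does not specify them.

```python
import math
from collections import Counter


def build_vectors(patients):
    """Turn {patient_id: [(concept_id, negated), ...]} into TF-IDF vectors.

    Assumption: each (concept, polarity) pair is its own dimension, so a
    negated mention never matches an affirmed mention of the same concept.
    """
    n_patients = len(patients)
    counts = {pid: Counter(mentions) for pid, mentions in patients.items()}

    # Document frequency: number of patients in which each dimension occurs.
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())

    vectors = {}
    for pid, c in counts.items():
        total = sum(c.values())
        vectors[pid] = {
            dim: (tf / total) * math.log(n_patients / doc_freq[dim])
            for dim, tf in c.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v[d] for d, w in u.items() if d in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(index_id, vectors, min_common=2, top_k=30):
    """Rank all other patients by similarity to the index patient.

    Candidates sharing fewer than `min_common` dimensions with the index
    patient are discarded before ranking (hypothetical filtering rule).
    """
    index_vec = vectors[index_id]
    scored = []
    for pid, vec in vectors.items():
        if pid == index_id:
            continue
        if len(index_vec.keys() & vec.keys()) < min_common:
            continue
        scored.append((cosine(index_vec, vec), pid))
    # Highest similarity first; ties broken by patient id for determinism.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [(pid, score) for score, pid in scored[:top_k]]
```

Treating polarity as part of the dimension is only one possible design; another would be to drop negated concepts entirely or to weight them differently.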

RESULTS

The percentages of index patients returning at least one true positive similar patient in the Top30 similar patients were 94% for LOWE, 97% for DEB, 86% for APDS, 71% for EBS-DM and 99% for RETT. On average, 51% of the 30 returned patients had exactly the same genetic disease.
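The two figures reported above can be read as a hit rate at rank 30 and a mean precision at rank 30. The sketch below shows one way to compute them under that reading; the function names, the `rankings` and `disease_of` inputs, and the choice to divide by `k` even when fewer patients are returned are assumptions made for illustration, not details taken from the paper.

```python
def hit_rate_at_k(rankings, disease_of, k=30):
    """Percentage of index patients with at least one same-disease patient
    among their top-k results.

    rankings:   {index_id: [patient ids ordered by decreasing similarity]}
    disease_of: {patient_id: disease label, or None when unknown}
    """
    if not rankings:
        return 0.0
    hits = 0
    for index_id, ranked in rankings.items():
        target = disease_of[index_id]
        if any(disease_of.get(pid) == target for pid in ranked[:k]):
            hits += 1
    return 100.0 * hits / len(rankings)


def mean_precision_at_k(rankings, disease_of, k=30):
    """Mean percentage of the top-k results that share the index disease.

    Assumption: the denominator is always k, so short result lists are
    penalised rather than normalised by their actual length.
    """
    if not rankings:
        return 0.0
    total = 0.0
    for index_id, ranked in rankings.items():
        target = disease_of[index_id]
        same = sum(1 for pid in ranked[:k] if disease_of.get(pid) == target)
        total += same / k
    return 100.0 * total / len(rankings)
```

Both functions would be run once per disease, using only the patients of that disease as index cases.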

CONCLUSION

This tool offers new perspectives in a translational context to identify patients for genetic research. Moreover, when new molecular bases are discovered, our strategy will help to identify additional eligible patients for genetic screening.
