Discovering monogenic patients with a confirmed molecular diagnosis in millions of clinical notes with MonoMiner.

Authors

Wu David Wei, Bernstein Jonathan A, Bejerano Gill

Affiliations

Department of Computer Science, Stanford University School of Engineering, Stanford, CA; Medical Scientist Training Program, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA.

Department of Pediatrics, Stanford University School of Medicine, Stanford, CA.

Publication information

Genet Med. 2022 Oct;24(10):2091-2102. doi: 10.1016/j.gim.2022.07.008. Epub 2022 Aug 17.

DOI: 10.1016/j.gim.2022.07.008
PMID: 35976265

Abstract

PURPOSE

Cohort building is a powerful foundation for improving clinical care, performing biomedical research, recruiting for clinical trials, and many other applications. We set out to build a cohort of all monogenic patients with a definitive causal gene diagnosis in a 3-million patient hospital system.

METHODS

We define a subset (4461) of OMIM diseases that have at least 1 known monogenic causal gene. We then introduce MonoMiner, a natural language processing framework to identify molecularly confirmed monogenic patients from free-text clinical notes.

RESULTS

We show that ICD-10-CM codes cover only a fraction of monogenic diseases and that even where available, ICD-10-CM code‒based patient retrieval offers 0.14 precision. Searching by causal gene symbol offers great recall but has an even worse 0.07 precision. MonoMiner achieves 6 to 11 times higher precision (0.80), with 0.87 precision on disease diagnosis alone, tagging 4259 patients with 560 monogenic diseases and 534 causal genes, at 0.48 recall.
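The "6 to 11 times higher precision" figure can be checked directly from the numbers reported above. The short sketch below recomputes both ratios and, as an additional derived quantity that the abstract itself does not report, the F1 score implied by 0.80 precision and 0.48 recall. The function names are illustrative and not taken from the paper.

```python
# Values as reported in the abstract.
PRECISION_ICD = 0.14        # ICD-10-CM code-based retrieval
PRECISION_GENE = 0.07       # causal gene symbol search
PRECISION_MONOMINER = 0.80  # MonoMiner
RECALL_MONOMINER = 0.48     # MonoMiner


def fold_improvement(new: float, baseline: float) -> float:
    """Return how many times larger `new` is than `baseline`."""
    return new / baseline


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (derived, not reported in the abstract)."""
    return 2 * precision * recall / (precision + recall)


icd_fold = fold_improvement(PRECISION_MONOMINER, PRECISION_ICD)
gene_fold = fold_improvement(PRECISION_MONOMINER, PRECISION_GENE)
f1 = f1_score(PRECISION_MONOMINER, RECALL_MONOMINER)

print(f"Precision vs ICD-10-CM codes: {icd_fold:.1f}x")
print(f"Precision vs gene symbol search: {gene_fold:.1f}x")
print(f"Implied F1 at 0.80 precision / 0.48 recall: {f1:.2f}")
```

The two ratios come out at roughly 5.7 and 11.4, which matches the rounded "6 to 11 times" range stated in the abstract.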

CONCLUSION

MonoMiner enables the discovery of a large, high-precision cohort of patients with monogenic diseases with an established molecular diagnosis, empowering numerous downstream uses. Because it relies solely on clinical notes, MonoMiner is highly portable, and its approach is adaptable to other domains and languages.
