Translating and evaluating historic phenotyping algorithms using SNOMED CT.

Affiliations

Institute of Health Informatics, University College London, London, UK.

Health Data Research UK, London, UK.

Publication information

J Am Med Inform Assoc. 2023 Jan 18;30(2):222-232. doi: 10.1093/jamia/ocac158.

DOI:10.1093/jamia/ocac158

PMID:36083213

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9846670/

Abstract

OBJECTIVE

Patient phenotype definitions based on terminologies are required for the computational use of electronic health records. Within UK primary care research databases, such definitions have typically been represented as flat lists of Read terms, but Systematized Nomenclature of Medicine-Clinical Terms (SNOMED CT) (a widely employed international reference terminology) enables the use of relationships between concepts, which could facilitate the phenotyping process. We implemented SNOMED CT-based phenotyping approaches and investigated their performance in the CPRD Aurum primary care database.

MATERIALS AND METHODS

We developed SNOMED CT phenotype definitions for 3 exemplar diseases: diabetes mellitus, asthma, and heart failure, using 3 methods: "primary" (primary concept and its descendants), "extended" (primary concept, descendants, and additional relations), and "value set" (based on text searches of term descriptions). We also derived SNOMED CT codelists in a semiautomated manner for 276 disease phenotypes used in a study of health across the lifecourse. Cohorts selected using each codelist were compared to "gold standard" manually curated Read codelists in a sample of 500 000 patients from CPRD Aurum.
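The three codelist strategies named above can be illustrated with a minimal sketch on a toy terminology. This is not the authors' implementation. The concept identifiers (`C1` to `C6`), the `IS_A` and `RELATED` edge lists, and all function names are hypothetical placeholders chosen for illustration, and the "extended" rule shown here (adding concepts linked to the primary set by a non-hierarchical relation) is only one plausible reading of "additional relations".

```python
from collections import defaultdict

# Toy terminology. All identifiers are hypothetical placeholders,
# NOT real SNOMED CT concept IDs.
TERMS = {
    "C1": "Diabetes mellitus",
    "C2": "Type 1 diabetes mellitus",
    "C3": "Type 2 diabetes mellitus",
    "C4": "Diabetic retinopathy",
    "C5": "Diabetes insipidus",
    "C6": "Disorder of retina",
}
# (child, parent) "is a" edges.
IS_A = [("C2", "C1"), ("C3", "C1"), ("C4", "C6")]
# (source, target) non-hierarchical relations (assumed, e.g. "due to").
RELATED = [("C4", "C1")]


def descendants(root, is_a):
    """Return the root concept plus everything below it in the is-a hierarchy."""
    children = defaultdict(set)
    for child, parent in is_a:
        children[parent].add(child)
    seen, stack = {root}, [root]
    while stack:
        for child in children[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def primary_codelist(root):
    """'Primary': the primary concept and its descendants."""
    return descendants(root, IS_A)


def extended_codelist(root):
    """'Extended': primary set plus concepts related to it, with their descendants."""
    base = primary_codelist(root)
    result = set(base)
    for source, target in RELATED:
        if target in base:
            result |= descendants(source, IS_A)
    return result


def value_set_codelist(keyword):
    """'Value set': case-insensitive text search over term descriptions."""
    needle = keyword.lower()
    return {cid for cid, term in TERMS.items() if needle in term.lower()}


if __name__ == "__main__":
    print(sorted(primary_codelist("C1")))
    print(sorted(extended_codelist("C1")))
    print(sorted(value_set_codelist("diabet")))
```

In this toy data the text search also picks up "Diabetes insipidus", which is unrelated to diabetes mellitus. That shows how a text-based value set can gain recall at some cost in precision.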

RESULTS

SNOMED CT codelists selected a similar set of patients to Read, with F1 scores exceeding 0.93, and age and sex distributions were similar. The "value set" and "extended" codelists had slightly greater recall but lower precision than "primary" codelists. We were able to represent 257 of the 276 phenotypes by a single concept hierarchy, and for 135 phenotypes, the F1 score was greater than 0.9.
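As a reminder of how precision, recall, and F1 relate when one cohort is compared with a reference cohort, here is a small sketch. The function name and the patient identifiers are hypothetical, and the snippet only illustrates the standard definitions, not the study's actual evaluation code.

```python
def precision_recall_f1(selected, gold):
    """Compare a selected patient set with a gold-standard patient set."""
    selected, gold = set(selected), set(gold)
    true_positives = len(selected & gold)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical patient IDs: the gold standard stands in for a Read
    # codelist cohort, the selection for a SNOMED CT codelist cohort.
    gold_cohort = {1, 2, 3, 4}
    snomed_cohort = {2, 3, 4, 5}
    print(precision_recall_f1(snomed_cohort, gold_cohort))
```

F1 is the harmonic mean of precision and recall, so it is high only when the two cohorts overlap closely in both directions.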

CONCLUSIONS

SNOMED CT provides an efficient way to define disease phenotypes, resulting in similar patient populations to manually curated codelists.
