Creating a next-generation phenotype library: the Health Data Research UK Phenotype Library.

Author information

Thayer Daniel S, Mumtaz Shahzad, Elmessary Muhammad A, Scanlon Ieuan, Zinnurov Artur, Coldea Alex-Ioan, Scanlon Jack, Chapman Martin, Curcin Vasa, John Ann, DelPozo-Banos Marcos, Davies Hannah, Karwath Andreas, Gkoutos Georgios V, Fitzpatrick Natalie K, Quint Jennifer K, Varma Susheel, Milner Chris, Oliveira Carla, Parkinson Helen, Denaxas Spiros, Hemingway Harry, Jefferson Emily

Affiliations

SAIL Databank, Medical School, Swansea University, Swansea, SA2 8PP, United Kingdom.

Health Informatics Centre, School of Medicine, University of Dundee, Dundee, DD1 9SY, United Kingdom.

Publication information

JAMIA Open. 2024 Jun 17;7(2):ooae049. doi: 10.1093/jamiaopen/ooae049. eCollection 2024 Jul.

Abstract

OBJECTIVE

To enable reproducible research at scale by creating a platform that lets health data users find, access, curate, and re-use electronic health record phenotyping algorithms.

MATERIALS AND METHODS

We undertook a structured approach to identifying requirements for a phenotype algorithm platform by engaging with key stakeholders. User experience analysis was used to inform the design, which we implemented as a web application featuring a novel metadata standard for defining phenotyping algorithms, access via Application Programming Interface (API), support for computable data flows, and version control. The application has creation and editing functionality, enabling researchers to submit phenotypes directly.
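
As an illustration of how structured metadata and version control can fit together, the following is a minimal in-memory sketch, not the Phenotype Library's actual metadata standard or API. The class names, field names, the identifier "PH-EXAMPLE", and the data source label are all hypothetical; the ICD-10 codes are used only as sample values.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class PhenotypeVersion:
    """One immutable version of a phenotype definition (hypothetical fields)."""
    version: int
    name: str
    data_sources: tuple[str, ...]
    codes: dict[str, tuple[str, ...]]  # coding system -> codes in that system


@dataclass
class PhenotypeRecord:
    """A phenotype whose edits are appended as new versions, never overwritten."""
    phenotype_id: str
    versions: list[PhenotypeVersion] = field(default_factory=list)

    def publish(self, name, data_sources, codes):
        """Store an edited definition as the next version and return it."""
        new = PhenotypeVersion(
            version=len(self.versions) + 1,
            name=name,
            data_sources=tuple(data_sources),
            codes={system: tuple(values) for system, values in codes.items()},
        )
        self.versions.append(new)
        return new

    def get(self, version=None):
        """Return a specific version, or the latest when none is given."""
        if not self.versions:
            raise LookupError("no versions published yet")
        if version is None:
            return self.versions[-1]
        if not 1 <= version <= len(self.versions):
            raise LookupError(f"unknown version: {version}")
        return self.versions[version - 1]


# Hypothetical usage: two successive edits of one definition.
record = PhenotypeRecord("PH-EXAMPLE")
record.publish("Asthma", ["Hypothetical admissions dataset"], {"ICD-10": ["J45"]})
record.publish("Asthma", ["Hypothetical admissions dataset"], {"ICD-10": ["J45", "J46"]})

latest = record.get()
first = record.get(1)
print(latest.version, latest.codes["ICD-10"])
print(first.version, first.codes["ICD-10"])
```

Keeping each version immutable means a study that cites a specific version can retrieve exactly the code list it used, even after the definition is later revised.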

RESULTS

We created and launched the Phenotype Library in October 2021. The platform currently hosts 1049 phenotype definitions defined against 40 health data sources and >200K terms across 16 medical ontologies. We present several case studies demonstrating its utility for supporting and enabling research: the library hosts curated phenotype collections for the BREATHE respiratory health research hub and the Adolescent Mental Health Data Platform, and it is supporting the development of an informatics tool to generate clinical evidence for clinical guideline development groups.

DISCUSSION

This platform makes an impact by being open to all health data users and accepting all appropriate content, as well as implementing key features that have not been widely available, including managing structured metadata, access via an API, and support for computable phenotypes.

CONCLUSIONS

We have created the first openly available, programmatically accessible resource enabling the global health research community to store and manage phenotyping algorithms. Removing barriers to describing, sharing, and computing phenotypes will help unleash the potential benefit of health data for patients and the public.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2713/11182945/5e005f0901da/ooae049f1.jpg
