TriNetX Dataworks-USA: Overview of a Multi-Purpose, De-Identified, Federated Electronic Health Record Real-World Data and Analytics Network and Comparison to the US Census.

Authors

Stein Ellen, Hüser Matthias, Amirian E Susan, Palchuk Matvey B, Brown Jeffrey S

Affiliations

TriNetX, LLC, Cambridge, Massachusetts, USA.

Harvard Medical School, Boston, Massachusetts, USA.

Publication information

Pharmacoepidemiol Drug Saf. 2025 Sep;34(9):e70198. doi: 10.1002/pds.70198.

Abstract

INTRODUCTION

Many clinical data networks focus on a single use case or disease. By contrast, the TriNetX Dataworks-USA Network contains real-world clinical information that can be applied to multiple research questions and use cases. The purpose of this study is to describe the Network's characteristics, as well as its generalizability to the US population, particularly the healthcare-seeking population.

METHODS

The Dataworks-USA Network is a large, regularly updated data network containing de-identified patient electronic health record (EHR) information from across the United States. To examine the generalizability of the Network, its basic demographics were summarized and compared with the US Census Bureau International Database (IDB) 2022 data and the National Cancer Institute's version of the Census Bureau's U.S. County Population Data for 2022.

RESULTS

Patients in the Dataworks-USA Network are approximately 5 years older than the Census population, and the Network has a larger proportion of female patients. The Network has a lower proportion of patients identified as Asian or White, and a higher proportion identified as Other race, relative to the Census; the remaining race categories are similar between the two data sources (< 1% difference). Regionally, Dataworks-USA has a smaller proportion of patients in all race categories compared with the Census because of its larger proportion of patients of Unknown or Other race.

CONCLUSIONS

TriNetX's Dataworks-USA Network provides a robust data source for many use cases and is broadly generalizable to the US population, particularly the healthcare-seeking population, with differences related to the underlying nature of the data sources.
