Federated queries of clinical data repositories: the sum of the parts does not equal the whole.

Affiliation

Information Technology, Harvard Medical School, Boston, Massachusetts 02115, USA.

Publication information

J Am Med Inform Assoc. 2013 Jun;20(e1):e155-61. doi: 10.1136/amiajnl-2012-001299. Epub 2013 Jan 24.

DOI:10.1136/amiajnl-2012-001299

PMID:23349080

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC3715334/

Abstract

BACKGROUND AND OBJECTIVE

In 2008 we developed the Shared Health Research Information Network (SHRINE), which for the first time enabled research queries across the full patient populations of four Boston hospitals. It uses a federated architecture in which each hospital returns only the aggregate count of patients who match a query. This allows hospitals to retain control over their local databases and to comply with federal and state privacy laws. However, because patients may receive care at multiple hospitals, the result of a federated query might differ from the result of the same query run against a single central repository. This paper describes the situations in which this happens and presents a technique for correcting these errors.
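To make the discrepancy concrete, the following is a minimal sketch, not SHRINE code. It assumes a simple single-criterion query, so that a patient matches centrally exactly when they match at one or more hospitals. Under that assumption, the per-hospital counts only bound the number of distinct matching patients: the true value is at least the largest single count (full overlap) and at most the sum of the counts (no overlap). The hospital names and counts are hypothetical.

```python
def federated_bounds(site_counts):
    """Bound the number of distinct matching patients from per-site counts.

    Assumes a patient matches the query overall if and only if they match
    at one or more sites (an illustrative simplification).
    """
    lower = max(site_counts, default=0)  # every smaller cohort is contained in the largest
    upper = sum(site_counts)             # no patient is shared between sites
    return lower, upper


# Hypothetical aggregate counts returned by three hospitals.
counts = {"hospital_a": 120, "hospital_b": 80, "hospital_c": 45}
print(federated_bounds(list(counts.values())))  # (120, 245)
```

The width of this interval is the uncertainty that comes from not knowing how the hospitals' patient populations overlap.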

METHODS

In a one-time process, we identify which patients have data in multiple repositories by comparing one-way hash values of patient demographics. This enables us to partition the local databases such that all patients within a given partition have data at the same subset of hospitals. Federated queries are then run independently on each partition, and the combined results are presented to the user.
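The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the choice of demographic fields, the use of salted SHA-256, the function names, and the per-partition bound are all hypothetical. The bound also assumes a single-criterion query in which a patient matches if they match at any hospital. Within one partition the number of patients is known, so the upper bound can be capped at the partition size.

```python
import hashlib
from collections import defaultdict


def demographic_hash(first, last, dob, sex, salt="example-shared-salt"):
    """One-way hash of normalised demographics (fields and salt are assumptions)."""
    key = "|".join(part.strip().lower() for part in (first, last, dob, sex))
    return hashlib.sha256(f"{salt}|{key}".encode("utf-8")).hexdigest()


def build_partitions(site_hashes):
    """Group patient hashes by the exact set of sites that hold data for them."""
    sites_by_hash = defaultdict(set)
    for site, hashes in site_hashes.items():
        for h in hashes:
            sites_by_hash[h].add(site)
    partitions = defaultdict(set)
    for h, sites in sites_by_hash.items():
        partitions[frozenset(sites)].add(h)
    return dict(partitions)


def partition_bounds(partition_size, site_counts):
    """Bounds on distinct matches inside one partition, capped at its size."""
    lower = max(site_counts)
    upper = min(sum(site_counts), partition_size)
    return lower, upper


def combined_bounds(per_partition):
    """Sum the per-partition bounds; per_partition holds (size, counts) pairs."""
    total_low = total_high = 0
    for size, counts in per_partition:
        low, high = partition_bounds(size, counts)
        total_low += low
        total_high += high
    return total_low, total_high


# Four hypothetical patients; p2 and p3 have data at both hospitals.
p1 = demographic_hash("Ann", "Lee", "1970-01-01", "F")
p2 = demographic_hash("Bob", "Ng", "1982-05-17", "M")
p3 = demographic_hash("Cy", "Ortiz", "1990-11-30", "M")
p4 = demographic_hash("Di", "Park", "1965-03-09", "F")
partitions = build_partitions({"A": {p1, p2, p3}, "B": {p2, p3, p4}})

# Hypothetical per-partition counts returned for one query.
per_partition = [
    (1, [1]),     # patients only at A: p1 matches
    (1, [0]),     # patients only at B: p4 does not match
    (2, [1, 2]),  # patients at both: A counts p2, B counts p2 and p3
]
print(combined_bounds(per_partition))  # (3, 3)
```

In this toy network the unpartitioned counts would be 2 at each hospital, giving bounds of 2 to 4 distinct patients, whereas the partitioned query pins the answer at 3. Single-hospital partitions always yield exact counts, so the remaining uncertainty is confined to the shared partitions.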

RESULTS

Using theoretical bounds and simulated hospital networks, we demonstrate that once the partitions are made, SHRINE can produce more precise estimates of the number of patients matching a query.

CONCLUSIONS

Uncertainty in the overlap of patient populations across hospitals limits the effectiveness of SHRINE and other federated query tools. Our technique reduces this uncertainty while retaining an aggregate federated architecture.
