Using Synthetic Data to Replace Linkage Derived Elements: A Case Study.

Authors

Resnick Dean M, Cox Christine S, Mirel Lisa B

Affiliations

NORC at the University of Chicago, Bethesda, Maryland.

Centers for Disease Control and Prevention, National Center for Health Statistics, Hyattsville, Maryland.

Publication information

Health Serv Outcomes Res Methodol. 2021 Feb 3;21:389-406. doi: 10.1007/s10742-021-00241-z.

DOI:10.1007/s10742-021-00241-z
PMID:34737669
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8563018/
Abstract

While record linkage can expand analyses performable from survey microdata, it also incurs greater risk of privacy-encroaching disclosure. One way to mitigate this risk is to replace some of the information added through linkage with synthetic data elements. This paper describes a case study using the National Hospital Care Survey (NHCS), which collects patient records under a pledge of protecting patient privacy from a sample of U.S. hospitals for statistical analysis purposes. The NHCS data were linked to the National Death Index (NDI) to enhance the survey with mortality information. The added information from NDI linkage enables survival analyses related to hospitalization, but as the death information includes dates of death and detailed causes of death, having it joined with the patient records increases the risk of patient re-identification (albeit only for deceased persons). For this reason, an approach was tested to develop synthetic data that uses models from survival analysis to replace vital status and actual dates-of-death with synthetic values and uses classification tree analysis to replace actual causes of death with synthesized causes of death. The degree to which analyses performed on the synthetic data replicate results from analysis on the actual data is measured by comparing survival analysis parameter estimates from both data files. Because synthetic data only have value to the degree that they can be used to produce statistical estimates that are like those based on the actual data, this evaluation is an essential first step in assessing the potential utility of synthetic mortality data.
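The synthesis-then-compare idea in the abstract can be illustrated with a small sketch. It is not the authors' method: the abstract does not name the survival models used, so a constant-hazard (exponential) model stands in here, the "actual" data are simulated rather than real NHCS or NDI records, and the classification-tree step for causes of death is omitted. The function names (`fit_exponential_rate`, `synthesize`, `demo`) and all parameter values are hypothetical.

```python
import random


def fit_exponential_rate(times, died):
    """MLE of a constant hazard under right censoring:
    number of deaths divided by total follow-up time."""
    return sum(died) / sum(times)


def synthesize(n, rate, follow_up, rng):
    """Draw n (time, died) pairs from an exponential model with the
    given rate, censoring anyone still alive at `follow_up` days."""
    times, died = [], []
    for _ in range(n):
        t = rng.expovariate(rate)
        if t <= follow_up:
            times.append(t)
            died.append(1)
        else:
            times.append(follow_up)
            died.append(0)
    return times, died


def demo(seed=0, n=20000, true_rate=0.002, follow_up=365.0):
    rng = random.Random(seed)
    # Stand-in for the linked "actual" data (simulated, assumption).
    actual_times, actual_died = synthesize(n, true_rate, follow_up, rng)
    rate_actual = fit_exponential_rate(actual_times, actual_died)
    # Replace vital status and death times with draws from the fitted model.
    synth_times, synth_died = synthesize(n, rate_actual, follow_up, rng)
    # Utility check: refit on the synthetic file and compare estimates.
    rate_synth = fit_exponential_rate(synth_times, synth_died)
    return rate_actual, rate_synth


if __name__ == "__main__":
    actual, synth = demo()
    print(f"hazard (actual):    {actual:.5f}")
    print(f"hazard (synthetic): {synth:.5f}")
```

If the two hazard estimates are close, the synthetic file preserves this particular analysis; a fuller evaluation would compare covariate effects from a regression-type survival model, as the abstract suggests.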

