Carelon Research, Wilmington, Delaware, USA.
Merck Healthcare KGaA, Darmstadt, Germany.
Pharmacoepidemiol Drug Saf. 2024 Aug;33(8):e5870. doi: 10.1002/pds.5870.
We investigated time trends in validation performance characteristics for six sources of death data available within the Healthcare Integrated Research Database (HIRD) over 8 years.
We conducted a secondary analysis of a cohort of advanced cancer patients with linked National Death Index (NDI) data identified in the HIRD between 2010 and 2018. Using NDI data as the reference standard, we calculated sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) for six sources of death status data and for an algorithm combining data from the available sources. Measures were calculated for each study year and included all members who were in the cohort for at least 1 day in that year.
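The four validation measures named above can be sketched as follows. This is a minimal illustration, not the study's code: the function name `validate_source`, the `ValidationMetrics` container, and the boolean-flag input format are assumptions made for the example, and confidence intervals are omitted because the abstract does not state how they were computed.

```python
from dataclasses import dataclass


@dataclass
class ValidationMetrics:
    sensitivity: float
    specificity: float
    ppv: float
    npv: float


def validate_source(source_dead, reference_dead):
    """Compare one source's death flags with reference-standard flags.

    Both arguments are equal-length sequences of booleans, one entry per
    cohort member: True means the member is recorded as deceased.
    (Hypothetical input format, chosen for illustration.)
    """
    pairs = list(zip(source_dead, reference_dead))
    tp = sum(1 for s, r in pairs if s and r)          # death in both
    fp = sum(1 for s, r in pairs if s and not r)      # death in source only
    fn = sum(1 for s, r in pairs if not s and r)      # death in reference only
    tn = sum(1 for s, r in pairs if not s and not r)  # death in neither

    def ratio(numerator, denominator):
        # Return NaN rather than raising when a denominator is empty.
        return numerator / denominator if denominator else float("nan")

    return ValidationMetrics(
        sensitivity=ratio(tp, tp + fn),
        specificity=ratio(tn, tn + fp),
        ppv=ratio(tp, tp + fp),
        npv=ratio(tn, tn + fn),
    )
```

To mirror the year-by-year design, one would call the function once per study year, passing only the members who were in the cohort for at least 1 day in that year.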
We identified 27 396 deaths from any source among 40 692 cohort members. Between 2010 and 2018, the sensitivity of the Death Master File (DMF) decreased from 0.77 (95% CI = 0.76, 0.79) to 0.12 (95% CI = 0.11, 0.14). In contrast, the sensitivity of online obituary data increased from 0.43 (95% CI = 0.41, 0.45) in 2012 to 0.71 (95% CI = 0.68, 0.73) in 2018. The sensitivity of the composite algorithm remained above 0.83 throughout the study period. For all sources, PPV was high from 2010 to 2016 and decreased thereafter. Specificity and NPV remained high throughout the study.
We observed that the sensitivity of mortality data sources, measured against the NDI, could change substantially between 2010 and 2018. Other validation characteristics were less variable. Combining multiple sources of mortality data may be necessary to achieve adequate performance, particularly for multiyear studies.