New York City Department of Health and Mental Hygiene, Bureau of HIV/AIDS Prevention and Control, Long Island City, NY.
New York City Department of Health and Mental Hygiene, Bureau of STD Control, Long Island City, NY.
Public Health Rep. 2014 Jan-Feb;129 Suppl 1(Suppl 1):79-84. doi: 10.1177/00333549141291S112.
Statistical modeling has suggested that the prevalence of false matches in data matching declines as the events become rarer or the number of matches increases. We examined the effect of case rate and coinfection rate in the population on the positive predictive value (PPV) of a matching algorithm for HIV/AIDS and sexually transmitted disease (STD) surveillance registry data.
We used LinkPlus™, a probabilistic data-matching program, to match HIV/AIDS cases diagnosed in New York City (NYC) from 1981 to March 31, 2012, and reported to the NYC HIV/AIDS surveillance registry against syphilis and chlamydia cases diagnosed in NYC from January 1 to June 30, 2010, and reported to the NYC STD registry. Match results were manually reviewed to determine true matches.
With an agreement/disagreement comparison score cutoff value of 10.0, LinkPlus identified 3,013 matches, of which 1,582 were determined to be true by manual review. PPV varied greatly across subpopulations with different case rates and coinfection rates. PPV was highest (91.6%) among male syphilis cases, a group with a relatively low case rate but a high HIV coinfection rate, and lowest (18.0%) among female chlamydia cases, a group with a high case rate but a low HIV coinfection rate. When the cutoff value was increased to 15.0, PPVs among male syphilis and female chlamydia cases increased to 98.3% and 90.5%, respectively.
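As a rough illustration of the arithmetic behind these figures, PPV is the share of algorithm-identified matches that manual review confirmed as true. The sketch below applies that definition to the overall counts reported above; the function name `ppv` is an assumption made for illustration and is not part of LinkPlus.

```python
def ppv(true_matches: int, total_matches: int) -> float:
    """Positive predictive value: confirmed true matches / all matches returned."""
    if total_matches <= 0:
        raise ValueError("total_matches must be positive")
    if not 0 <= true_matches <= total_matches:
        raise ValueError("true_matches must lie between 0 and total_matches")
    return true_matches / total_matches


# Overall counts reported at the 10.0 cutoff: 1,582 true out of 3,013 matches.
overall_ppv = ppv(1582, 3013)
print(f"Overall PPV at cutoff 10.0: {overall_ppv:.1%}")
```

The subgroup PPVs quoted above (for example 91.6% and 18.0%) would follow from the same ratio applied to each subgroup's own counts, which are not given in this abstract.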
Case rates and coinfection rates have a significant effect on the PPV of a registry data-matching algorithm: PPV decreases as the case rate increases and coinfection rate decreases. Before conducting registry data matching, program staff should assess the case rate and coinfection rate of the population included in the data matching and select an appropriate matching algorithm.
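One way to act on this recommendation, sketched here under stated assumptions, is to manually review a sample of candidate pairs for the population of interest and then pick the lowest score cutoff whose PPV reaches a chosen target. The helper names, the sample scores, the candidate cutoffs, and the 90% target below are all hypothetical, and the sketch assumes a pair is kept when its score is at or above the cutoff. It is not the authors' procedure or a LinkPlus feature.

```python
from typing import List, Optional, Sequence, Tuple

# Each reviewed pair: (comparison score, True if manual review confirmed the match).
ReviewedPair = Tuple[float, bool]


def ppv_at_cutoff(reviewed: Sequence[ReviewedPair], cutoff: float) -> Optional[float]:
    """PPV among pairs scoring at or above the cutoff; None if no pair is kept."""
    kept = [is_true for score, is_true in reviewed if score >= cutoff]
    if not kept:
        return None
    return sum(kept) / len(kept)


def lowest_cutoff_meeting_target(
    reviewed: Sequence[ReviewedPair],
    cutoffs: Sequence[float],
    target_ppv: float,
) -> Optional[float]:
    """Return the smallest candidate cutoff whose PPV is at least target_ppv."""
    for cutoff in sorted(cutoffs):
        value = ppv_at_cutoff(reviewed, cutoff)
        if value is not None and value >= target_ppv:
            return cutoff
    return None


# Hypothetical manually reviewed sample (illustrative values only, not study data).
sample: List[ReviewedPair] = [
    (10.5, False), (11.2, False), (12.0, True), (13.4, False),
    (15.1, True), (16.8, True), (18.3, True), (21.0, True),
]

chosen = lowest_cutoff_meeting_target(sample, cutoffs=[10.0, 12.5, 15.0], target_ppv=0.90)
print("Chosen cutoff:", chosen)
```

Raising the cutoff generally trades missed true matches for fewer false ones, so a check like this would need to be weighed against how many confirmed matches fall below the chosen value.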