Desai Rishi J, Solomon Daniel H, Shadick Nancy, Iannaccone Christine, Kim Seoyoung C
Division of Pharmacoepidemiology and Pharmacoeconomics, Department of Medicine, Brigham and Women's Hospital & Harvard Medical School, Boston, MA, USA.
Division of Rheumatology, Immunology and Allergy, Brigham and Women's Hospital & Harvard Medical School, Boston, MA, USA.
Pharmacoepidemiol Drug Saf. 2016 Apr;25(4):472-5. doi: 10.1002/pds.3953. Epub 2016 Jan 13.
This study examined the accuracy of claims-based algorithms for identifying smoking, using self-reported smoking data as the reference.
Medicare patients enrolled in the Brigham and Women's Hospital Rheumatoid Arthritis Sequential Study were identified. For each patient, self-reported smoking status was extracted from the Brigham and Women's Hospital Rheumatoid Arthritis Sequential Study, and the date of this measurement was defined as the index date. Two algorithms identified smoking in Medicare claims: (i) using only diagnosis and procedure codes and (ii) using anti-smoking prescriptions in addition to diagnosis and procedure codes. Each algorithm was implemented twice: first using only claims from the 365 days before the index date, and then using all available pre-index claims. Considering self-reported smoking status as the gold standard, we calculated specificity, sensitivity, positive predictive value, negative predictive value (NPV), and area under the curve (AUC).
A total of 128 patients were included in this study, of whom 48% reported smoking. The algorithm using only diagnosis and procedure codes had the lowest sensitivity (9.8%, 95% CI 2.4%-17.3%), NPV (54.9%, 95% CI 46.1%-63.9%), and AUC (0.55, 95% CI 0.51-0.59) when applied to the 365-day pre-index period. Incorporating pharmacy claims and using all available pre-index information improved the sensitivity (27.9%, 95% CI 16.6%-39.1%), NPV (60.4%, 95% CI 51.3%-69.5%), and AUC (0.64, 95% CI 0.58-0.70). The specificity and positive predictive value were 100% for all the algorithms tested.
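As an illustration of how these accuracy measures relate to one another, the following minimal Python sketch computes them from a 2x2 table. The cell counts are not reported in the abstract. They are back-calculated here from the published percentages, assuming 61 self-reported smokers and 67 non-smokers among the 128 patients, and should be read as a plausible reconstruction only. The AUC is taken as the average of sensitivity and specificity, which is the area under a single-threshold ROC curve; the abstract does not state how AUC was actually computed. The names `Confusion` and `accuracy_measures` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """2x2 table of algorithm result versus self-report (the gold standard)."""
    tp: int  # algorithm flags smoker, patient reports smoking
    fp: int  # algorithm flags smoker, patient reports not smoking
    fn: int  # algorithm misses a self-reported smoker
    tn: int  # algorithm and self-report agree on non-smoker


def accuracy_measures(c: Confusion) -> dict:
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    ppv = c.tp / (c.tp + c.fp)
    npv = c.tn / (c.tn + c.fn)
    # Assumption: AUC of a binary classifier with one operating point.
    auc = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "auc": auc,
    }


# Assumed counts, inferred from the reported percentages (not from the source).
codes_only_365d = Confusion(tp=6, fp=0, fn=55, tn=67)
with_pharmacy_all = Confusion(tp=17, fp=0, fn=44, tn=67)

for label, table in [
    ("codes only, 365 days pre-index", codes_only_365d),
    ("codes + pharmacy, all pre-index", with_pharmacy_all),
]:
    m = accuracy_measures(table)
    print(
        f"{label}: sensitivity={m['sensitivity']:.1%}, "
        f"specificity={m['specificity']:.1%}, PPV={m['ppv']:.1%}, "
        f"NPV={m['npv']:.1%}, AUC={m['auc']:.2f}"
    )
```

Under these assumed counts, the computed point estimates match the reported ones, which shows why NPV and AUC stay modest even with perfect specificity: every missed smoker is counted as a false negative.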
Claims-based algorithms can identify smokers with limited sensitivity but very high specificity. In the absence of other reliable means, use of a claims-based algorithm to identify smoking could be cautiously considered in observational studies.