Development and Validation of Coding Algorithms to Identify Patients with Incident Non-Small Cell Lung Cancer in United States Healthcare Claims Data.

Authors

Beyrer Julie, Nelson David R, Sheffield Kristin M, Huang Yu-Jing, Lau Yiu-Keung, Hincapie Ana L

Affiliations

Eli Lilly and Company, Indianapolis, IN, USA.

University of Cincinnati James L. Winkle College of Pharmacy, Cincinnati, OH, USA.

Publication information

Clin Epidemiol. 2023 Jan 12;15:73-89. doi: 10.2147/CLEP.S389824. eCollection 2023.

Abstract

PURPOSE

We sought to develop and validate an incident non-small cell lung cancer (NSCLC) algorithm for United States (US) healthcare claims data. Diagnoses and procedures, but not medications, were incorporated to support longer-term relevance and reliability.

METHODS

Patients with newly diagnosed NSCLC according to Surveillance, Epidemiology, and End Results (SEER) data served as cases. Controls included patients with newly diagnosed small-cell lung cancer or other lung cancers, plus two 5% random samples: one of patients with other cancers and one of people without cancer. Algorithms derived from logistic regression and machine learning methods either used the entire sample (Approach A) or started with a previous algorithm for those with lung cancer (Approach B). Sensitivity, specificity, positive predictive value (PPV), negative predictive value, and F-score (compared across 1000 bootstrap samples) were calculated. Misclassification was evaluated by calculating the odds of selection by the algorithm among true positives and true negatives.
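As an illustration of the performance measures named above, the following is a minimal sketch, not the authors' code. It assumes binary labels (1 = NSCLC per the reference standard, 0 = control), assumes the F-score is the balanced F1 measure, and uses a simple resample-with-replacement bootstrap. The function names `confusion_counts`, `metrics`, and `bootstrap_f_scores` are hypothetical.

```python
import random


def confusion_counts(truth, predicted):
    """Count true/false positives and negatives for binary labels."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    return tp, fp, fn, tn


def _ratio(numerator, denominator):
    """Return numerator/denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def metrics(truth, predicted):
    """Sensitivity, specificity, PPV, NPV, and F-score (assumed to be F1)."""
    tp, fp, fn, tn = confusion_counts(truth, predicted)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "f_score": _ratio(2 * tp, 2 * tp + fp + fn),
    }


def bootstrap_f_scores(truth, predicted, n_boot=1000, seed=0):
    """F-score on n_boot resamples drawn with replacement (assumed scheme)."""
    rng = random.Random(seed)
    n = len(truth)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        resampled = metrics([truth[i] for i in idx], [predicted[i] for i in idx])
        scores.append(resampled["f_score"])
    return scores


# Toy labels for illustration only; these are not study data.
toy_truth = [1, 1, 1, 0, 0, 0, 0, 1]
toy_predicted = [1, 1, 0, 1, 0, 0, 0, 1]
toy_metrics = metrics(toy_truth, toy_predicted)
toy_scores = bootstrap_f_scores(toy_truth, toy_predicted, n_boot=50)
```

Comparing two algorithms would then amount to comparing their F-score distributions over the same resamples; how the study actually performed that comparison is not stated in the abstract.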

RESULTS

The best-performing algorithm used neural networks (Approach B). A 10-variable point-score algorithm was derived from logistic regression (Approach B); its sensitivity was 77.69% and its PPV was 67.61% (F-score = 72.30%). This algorithm was less sensitive for patients who were ≥80 years old, had Medicare follow-up time <3 months, or were missing SEER data on stage, laterality, or site. It was less specific for patients with a SEER primary site of main bronchus, a SEER summary stage 2000 of regional by direct extension only, or pre-index chronic pulmonary disease.
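The reported F-score is consistent with the harmonic mean of sensitivity and PPV (the F1 measure). The short check below assumes that definition, which the abstract does not spell out; `f_score` is a hypothetical helper name.

```python
def f_score(sensitivity, ppv):
    """Harmonic mean of sensitivity (recall) and PPV (precision)."""
    return 2 * sensitivity * ppv / (sensitivity + ppv)


# Values reported for the 10-variable point-score algorithm.
reported = f_score(0.7769, 0.6761)
print(f"{reported:.2%}")  # prints 72.30%
```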

CONCLUSION

Our study developed and validated a practical, 10-variable, point-based algorithm for identifying incident NSCLC cases in a US claims database based on a previously validated incident lung cancer algorithm.
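A point-based algorithm of this kind can be sketched as a weighted sum of claims-derived indicators compared against a cutoff. The abstract does not list the 10 variables, their points, or the threshold, so every name and number below is a hypothetical placeholder and only three indicators are shown.

```python
# Hypothetical placeholders: the real algorithm has 10 variables whose
# names, point values, and cutoff are not given in the abstract.
HYPOTHETICAL_POINTS = {
    "indicator_a": 3,
    "indicator_b": 2,
    "indicator_c": -1,
}
HYPOTHETICAL_THRESHOLD = 3


def point_score(patient_flags, points=HYPOTHETICAL_POINTS):
    """Sum the points of every indicator that is present for the patient."""
    return sum(w for name, w in points.items() if patient_flags.get(name, False))


def classify(patient_flags, threshold=HYPOTHETICAL_THRESHOLD):
    """Flag the patient as a likely incident NSCLC case at or above the cutoff."""
    return point_score(patient_flags) >= threshold


example_positive = classify({"indicator_a": True})
example_negative = classify({"indicator_b": True, "indicator_c": True})
```

The appeal of this form is that it can be applied to a claims extract with simple arithmetic, without refitting a model.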

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b1d1/9842515/ec5ec00fbad3/CLEP-15-73-g0001.jpg
