The usefulness of administrative databases for identifying disease cohorts is increased with a multivariate model.

Affiliation

Clinical Epidemiology Program, Ottawa Hospital Research Institute, F660-1053 Carling Avenue, Ottawa, Ontario, Canada.

Publication information

J Clin Epidemiol. 2010 Dec;63(12):1332-41. doi: 10.1016/j.jclinepi.2010.01.016. Epub 2010 May 8.

DOI:10.1016/j.jclinepi.2010.01.016

PMID:20457509

Abstract

BACKGROUND

Administrative databases commonly use codes to indicate diagnoses. These codes alone are often inadequate to accurately identify patients with particular conditions. In this study, we determined whether we could quantify the probability that a person has a particular disease (in this case, renal failure) using other routinely collected information available in an administrative data set. This would allow the accurate identification of a disease cohort in an administrative database.

METHODS

We determined whether patients in a random sample of 100,000 hospitalizations had kidney disease (defined as two or more sequential serum creatinine values, or the single admission creatinine value, indicating a calculated glomerular filtration rate less than 60 mL/min/1.73 m²). The independent association of patient- and hospitalization-level variables with renal failure was measured using a multivariate logistic regression model in a random 50% sample of the patients. The model was validated in the remaining patients.
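The split-sample design described above (fit a logistic regression in a random 50% derivation sample, then check it in the remaining patients) can be sketched as follows. This is a minimal illustration only, not the authors' model: the data are synthetic, the two predictors (`has_code`, `older`) and their coefficients are hypothetical, and the fitting routine is a plain gradient-descent stand-in for whatever statistical software the study used.

```python
import math
import random

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(w, x):
    """Predicted probability of disease for one row; w[0] is the intercept."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=1.0, epochs=500):
    """Fit logistic regression by plain batch gradient descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for x, yi in zip(X, y):
            err = predict_proba(w, x) - yi
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Hypothetical synthetic cohort: two binary predictors, one binary outcome.
rng = random.Random(0)
rows = []
for _ in range(1000):
    has_code = 1 if rng.random() < 0.3 else 0   # diagnostic code present (assumed)
    older = 1 if rng.random() < 0.5 else 0      # older age group (assumed)
    true_p = sigmoid(-2.5 + 3.0 * has_code + 1.5 * older)
    disease = 1 if rng.random() < true_p else 0
    rows.append(((has_code, older), disease))

# Random 50% derivation sample; the remainder is held out for validation.
rng.shuffle(rows)
half = len(rows) // 2
derivation, validation = rows[:half], rows[half:]

weights = fit_logistic([x for x, _ in derivation], [y for _, y in derivation])

# Crude calibration check in the validation half: mean predicted vs. observed.
val_probs = [predict_proba(weights, x) for x, _ in validation]
mean_predicted = sum(val_probs) / len(val_probs)
observed = sum(y for _, y in validation) / len(validation)
print("weights:", [round(w, 2) for w in weights])
print("mean predicted:", round(mean_predicted, 3), "observed:", round(observed, 3))
```

Comparing the mean predicted probability with the observed prevalence in the held-out half is only the coarsest form of validation; a fuller check would also look at discrimination and at calibration across risk strata.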

RESULTS

Twenty thousand seven hundred thirteen patients had kidney disease (20.7%). A diagnostic code of kidney disease was strongly associated with kidney disease (relative risk: 34.4), but the accuracy of the code was poor (sensitivity: 37.9%; specificity: 98.9%). Twenty-nine patient- and hospitalization-level variables entered the kidney disease model. This model had excellent discrimination (c-statistic: 90.1%) and accurately predicted the probability of true renal failure. The probability threshold that maximized sensitivity and specificity for the identification of true kidney disease was 21.3% (sensitivity: 80.0%; specificity: 82.2%).
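The accuracy measures reported above (sensitivity, specificity, c-statistic, and a probability threshold) can be computed as in the sketch below. The ten patients, their predicted probabilities, and the code flags are invented for illustration and are not study data. The abstract does not state how the threshold "maximized sensitivity and specificity"; the sketch assumes Youden's J (sensitivity + specificity - 1) as one common way to do this.

```python
def sens_spec(y_true, y_pred):
    """Sensitivity and specificity from binary truth and binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def c_statistic(y_true, probs):
    """Chance that a random case scores above a random non-case (ties count 1/2)."""
    cases = [p for t, p in zip(y_true, probs) if t == 1]
    controls = [p for t, p in zip(y_true, probs) if t == 0]
    wins = sum(1.0 if c > n else 0.5 if c == n else 0.0
               for c in cases for n in controls)
    return wins / (len(cases) * len(controls))

def best_threshold(y_true, probs):
    """Threshold maximizing Youden's J (an assumed criterion, see lead-in)."""
    best = None
    for thr in sorted(set(probs)):
        pred = [1 if p >= thr else 0 for p in probs]
        se, sp = sens_spec(y_true, pred)
        j = se + sp - 1
        if best is None or j > best[0]:
            best = (j, thr, se, sp)
    return best[1], best[2], best[3]

# Hypothetical toy data: true disease status, model probability, and code flag.
y = [0, 0, 0, 0, 0, 0, 1, 0, 1, 1]
probs = [0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.25, 0.30, 0.60, 0.90]
code = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

code_se, code_sp = sens_spec(y, code)   # diagnostic code used alone
c = c_statistic(y, probs)               # discrimination of the model
thr, se, sp = best_threshold(y, probs)  # model-based classification

print("code alone: sensitivity", round(code_se, 3), "specificity", round(code_sp, 3))
print("c-statistic:", round(c, 3))
print("threshold:", thr, "sensitivity", round(se, 3), "specificity", round(sp, 3))
```

In this toy example the code alone misses two of three true cases while never flagging a non-case, which mirrors the low-sensitivity, high-specificity pattern the abstract reports for the diagnostic code. Thresholding the model probability trades a little specificity for much higher sensitivity.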

CONCLUSION

Multiple variables available in administrative databases can be combined to quantify the probability that a person has a particular disease. This process permits accurate identification of a disease cohort in an administrative database. These methods may be extended to other diagnoses or procedures and could both facilitate and clarify the use of administrative databases for research and quality improvement.
