
A bootstrapping algorithm to improve cohort identification using structured data.

Affiliations

Department of Biomedical Informatics, University of Utah, Salt Lake City, UT, United States.

Publication information

J Biomed Inform. 2011 Dec;44 Suppl 1:S63-S68. doi: 10.1016/j.jbi.2011.10.013. Epub 2011 Nov 7.

DOI: 10.1016/j.jbi.2011.10.013
PMID: 22079803

Abstract

Cohort identification is an important step in conducting clinical research studies. Use of ICD-9 codes to identify disease cohorts is a common approach that can yield satisfactory results in certain conditions; however, for many use-cases more accurate methods are required. In this study, we propose a bootstrapping method that supplements ICD-9 codes with lab results, medications, etc. to build classification models that can be used to identify cohorts more accurately. The proposed method does not require prior information about the true class of the patients. We used the method to identify Diabetes Mellitus (DM) and Hyperlipidemia (HL) patient cohorts from a database of 800 thousand patients. Evaluation results show that the method identified 11,000 patients who did not have DM related ICD-9 codes as positive for DM and 52,000 patients without HL codes as positive for HL. A review of 400 patient charts (200 patients for each condition) by two clinicians shows that in both the conditions studied, the labeling assigned by the proposed approach is more consistent with that of the clinicians compared to labeling through ICD-9 codes. The method is reasonably automated and, we believe, holds potential for inexpensive, more accurate cohort identification.
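
The abstract does not say which classifier or iteration scheme the authors used, so the following is only a minimal sketch of the general idea: treat the presence of an ICD-9 code as a noisy starting label, fit a model on structured features, relabel the patients, and repeat until the labels settle. The nearest-centroid model, the two features (an HbA1c value and a medication flag), and the ten synthetic patients are all assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean


def centroid(rows):
    """Mean feature vector of a non-empty list of feature tuples."""
    return tuple(mean(col) for col in zip(*rows))


def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bootstrap_labels(features, seed_labels, max_rounds=10):
    """Self-training loop (illustrative stand-in, not the paper's algorithm).

    Starts from noisy seed labels, fits a nearest-centroid model on the
    current labels, relabels every patient, and stops once the labels
    no longer change or max_rounds is reached.
    """
    labels = list(seed_labels)
    for _ in range(max_rounds):
        pos = [f for f, y in zip(features, labels) if y]
        neg = [f for f, y in zip(features, labels) if not y]
        if not pos or not neg:
            break  # cannot fit two centroids from a single class
        c_pos, c_neg = centroid(pos), centroid(neg)
        new_labels = [sq_dist(f, c_pos) < sq_dist(f, c_neg) for f in features]
        if new_labels == labels:
            break  # converged
        labels = new_labels
    return labels


# Hypothetical patients: (HbA1c in percent, on glucose-lowering drug 0/1).
features = [
    (5.2, 0), (5.4, 0), (5.6, 0), (5.1, 0),   # no DM code, normal labs
    (8.1, 1), (7.9, 1), (9.0, 1),             # DM code present
    (8.4, 1), (7.6, 1),                       # no DM code, but DM-like data
    (5.3, 0),                                 # DM code present, normal data
]
# Seed labels: True where a DM-related ICD-9 code is on record (assumed).
has_icd9_code = [False, False, False, False, True, True, True,
                 False, False, True]

labels = bootstrap_labels(features, has_icd9_code)

# Patients labeled positive by the model despite having no ICD-9 code.
newly_found = [i for i, (code, y) in enumerate(zip(has_icd9_code, labels))
               if y and not code]
print("model labels:", labels)
print("positive without ICD-9 code:", newly_found)
```

In this toy setting the loop needs no clinician-assigned labels, which mirrors the abstract's claim that the method requires no prior information about the true class. A real implementation would need feature scaling, a stronger classifier, and validation against chart review, as the study did.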


