The absence of longitudinal data limits the accuracy of high-throughput clinical phenotyping for identifying type 2 diabetes mellitus subjects.

Affiliation

Institute for Health Informatics, University of Minnesota, Twin Cities, MN, USA.

Publication information

Int J Med Inform. 2013 Apr;82(4):239-47. doi: 10.1016/j.ijmedinf.2012.05.015. Epub 2012 Jul 2.

Abstract

PURPOSE

To evaluate the impact of insufficient longitudinal data on the accuracy of a high-throughput clinical phenotyping (HTCP) algorithm for identifying (1) patients with type 2 diabetes mellitus (T2DM) and (2) patients with no diabetes.

METHODS

This was a retrospective study conducted at Mayo Clinic in Rochester, Minnesota. Eligible subjects were Olmsted County residents with ≥1 Mayo Clinic encounter in each of three time periods: (1) 2007, (2) 1997 through 2006, and (3) before 1997 (N = 54,283). Diabetes-relevant electronic medical record (EMR) data on diagnoses, laboratory results, and medications were used. We employed the HTCP algorithm to categorize individuals as T2DM cases or non-diabetes controls. Taking the categorizations based on the full 11 years (1997-2007) as the gold standard, we compared them with categorizations based on 10 successively shorter intervals, ranging from 1998-2007 (10-year data) to 2007 alone (1-year data). Positive predictive values (PPVs) and false-negative rates (FNRs) were calculated. McNemar tests were used to determine whether categorizations using shorter time periods differed from the gold standard. Statistical significance was defined as P < 0.05.
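The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's code: the function names `ppv_and_fnr` and `mcnemar_p` and the toy labels are hypothetical, and the McNemar test is implemented here as the continuity-corrected chi-square version, which is an assumption because the abstract does not say which variant was used.

```python
import math


def ppv_and_fnr(gold, short):
    """Return (PPV, FNR) of a shorter-window labelling against the gold standard.

    gold, short: equal-length sequences of booleans, True meaning the subject
    was placed in the category of interest (e.g. T2DM case).
    """
    pairs = list(zip(gold, short))
    tp = sum(1 for g, s in pairs if g and s)        # in category under both
    fp = sum(1 for g, s in pairs if not g and s)    # only under the short window
    fn = sum(1 for g, s in pairs if g and not s)    # only under the gold standard
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    fnr = fn / (tp + fn) if (tp + fn) else float("nan")
    return ppv, fnr


def mcnemar_p(gold, short):
    """P value of McNemar's chi-square test with continuity correction (1 df)."""
    pairs = list(zip(gold, short))
    b = sum(1 for g, s in pairs if g and not s)     # discordant: gold only
    c = sum(1 for g, s in pairs if not g and s)     # discordant: short only
    if b + c == 0:
        return 1.0                                  # no discordant pairs
    stat = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2))


# Toy labels for ten hypothetical subjects (not study data).
gold_labels = [True, True, True, True, False, False, False, False, False, False]
short_labels = [True, True, True, False, True, False, False, False, False, False]

ppv, fnr = ppv_and_fnr(gold_labels, short_labels)   # 0.75 and 0.25 for this toy input
p_value = mcnemar_p(gold_labels, short_labels)
```

McNemar's test uses only the discordant pairs, that is, subjects whose categorization changes between the full and the shortened window, which is why it suits this paired design.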

RESULTS

We identified 2770 T2DM cases and 21,005 controls when the algorithm was applied using 11-year data. Using 2007 data alone, PPVs and FNRs, respectively, were 70% and 25% for case identification and 59% and 67% for control identification. All time frames differed significantly from the gold standard, except for the 10-year period.
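As a rough reading aid, the reported percentages can be turned back into approximate counts. The symbols TP, FP, and FN are introduced here for illustration, and the counts are back-calculations from rounded percentages, so they are approximations rather than figures reported by the study.

```latex
\[
\mathrm{PPV} = \frac{TP}{TP + FP}, \qquad
\mathrm{FNR} = \frac{FN}{TP + FN}
\]
% Cases, 2007 data alone: TP + FN = 2770 (gold-standard cases)
\[
FN \approx 0.25 \times 2770 \approx 693, \qquad
TP \approx 2770 - 693 = 2077, \qquad
TP + FP \approx \frac{2077}{0.70} \approx 2967
\]
% Controls, 2007 data alone: TP + FN = 21005 (gold-standard controls)
\[
TP \approx 0.33 \times 21005 \approx 6932, \qquad
TP + FP \approx \frac{6932}{0.59} \approx 11749
\]
```

Under these approximations, 1-year data would miss roughly a quarter of the gold-standard cases and about two thirds of the gold-standard controls.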

CONCLUSIONS

The accuracy of the algorithm decreased markedly as data were limited to shorter observation periods. This effect should be considered carefully when designing and executing HTCP algorithms.
