Dual-stream algorithms for dementia detection: Harnessing structured and unstructured electronic health record data, a novel approach to prevalence estimation.

Author information

Collyer Taya A, Liu Ming, Beare Richard, Andrew Nadine E, Ung David, Carver Alison, Ilomaki Jenni, Bell J Simon, Thrift Amanda G, Rocca Walter A, St Sauver Jennifer L, Lu Alicia, Siostrom Kristy, Moran Chris, Roberts Helene, Chong Trevor T-J, Murray Anne, Ravipati Tanya, O'Bree Bridget, Srikanth Velandai K

Affiliations

National Centre for Healthy Ageing, Frankston, Victoria, Australia.

Peninsula Clinical School, School of Translational Medicine, Monash University, Frankston, Victoria, Australia.

Publication information

Alzheimers Dement. 2025 May;21(5):e70132. doi: 10.1002/alz.70132.

DOI:10.1002/alz.70132
PMID:40325920
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12053150/
Abstract

INTRODUCTION

Identifying individuals with dementia is crucial for prevalence estimation and service planning, but reliable, scalable methods are lacking. We developed a novel set of algorithms using both structured and unstructured electronic health record (EHR) data, applying Diagnostic and Statistical Manual of Mental Disorders criteria for dementia case identification.

METHODS

Our cohort (n = 1082) included individuals aged ≥ 60 with dementia identified through specialist clinics and a comparison group without dementia. Clinicians from Australia and the United States informed predictor selection. We developed algorithms through a biostatistics stream for structured data and a natural language processing (NLP) stream for text, synthesizing results via logistic regression.
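The synthesis step described above can be illustrated with a minimal sketch, assuming each stream has already produced one score per patient. Everything here is hypothetical: the synthetic scores, the names `structured_score` and `text_score`, and the plain gradient-descent fit are stand-ins for illustration, not the authors' predictors or fitting procedure.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, x):
    # weights[0] is the intercept; the rest align with the features in x.
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))


def fit_logistic(rows, labels, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent.

    rows: list of feature lists, labels: list of 0/1 outcomes.
    Returns [intercept, w1, w2, ...].
    """
    n = len(rows)
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict(weights, x) - y
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


def clip01(value):
    return min(1.0, max(0.0, value))


# Synthetic stand-in data (assumption): 200 patients, half with dementia.
rng = random.Random(0)
labels = [i % 2 for i in range(200)]
# Hypothetical output of the structured-data stream.
structured_score = [clip01(0.35 + 0.3 * y + rng.gauss(0, 0.2)) for y in labels]
# Hypothetical output of the NLP stream.
text_score = [clip01(0.35 + 0.3 * y + rng.gauss(0, 0.2)) for y in labels]

# Second-stage model: the two stream scores are the only predictors.
stacked = [[s, t] for s, t in zip(structured_score, text_score)]
weights = fit_logistic(stacked, labels)
combined_score = [predict(weights, x) for x in stacked]
```

In this sketch the second-stage model sees only the two stream scores, so its coefficients show how much weight each stream receives in the combined prediction. A real analysis would fit and evaluate on separate data to avoid optimistic estimates.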

RESULTS

The final structured model retained 16 variables (area under the receiver operating characteristic curve [AUC] 0.853, specificity 72.2%, sensitivity 80.6%). NLP classifiers (logistic regression, support vector machine, and random forest models) performed comparably. The final, combined model outperformed all others (AUC = 0.951, P < 0.001 for comparison to structured model).
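For readers less familiar with the metrics reported above, the following sketch shows how AUC, sensitivity, and specificity are computed from predicted scores and true labels. The scores, labels, and the 0.5 threshold are made-up toy values, not data or cut-offs from the study.

```python
def auc(scores, labels):
    # AUC as the probability that a randomly chosen case scores higher
    # than a randomly chosen non-case; ties count as half.
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(scores, labels, threshold):
    # Classify as a case when the score is at or above the threshold.
    tp = fn = tn = fp = 0
    for s, y in zip(scores, labels):
        predicted_case = s >= threshold
        if y == 1 and predicted_case:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_case:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Toy values (assumption): 4 cases and 4 non-cases.
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.6, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]

toy_auc = auc(toy_scores, toy_labels)
toy_sens, toy_spec = sensitivity_specificity(toy_scores, toy_labels, 0.5)
```

AUC does not depend on a threshold, whereas sensitivity and specificity trade off against each other as the threshold moves, which is why a single model can be summarized by one AUC but many sensitivity and specificity pairs.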

DISCUSSION

Embedding text-derived insights within algorithms trained on structured medical data significantly enhances dementia identification capacity.

HIGHLIGHTS

Algorithmic tools for detection of individuals with dementia are available; however, previous work has used heterogeneous case definitions which are not clinically meaningful, and has relied on proxies such as diagnostic codes or medications for case ascertainment.

We used a novel, dual-stream algorithmic development approach, simultaneously and separately modeling a clinically meaningful outcome (diagnosis of dementia according to specialized clinical impression) using structured and unstructured electronic health record datasets.

Our clinically grounded case definition supported the inclusion of key structured variables (such as dementia International Classification of Disease codes and medications) as modeling predictors rather than outcomes.

Our algorithms, published in detail to support validation and replication, represent a major step forward in the use of routinely collected data for detection of diagnosed dementia.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a3ce/12053150/3fc6d1fc56ee/ALZ-21-e70132-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a3ce/12053150/96c3cb2989f1/ALZ-21-e70132-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a3ce/12053150/926eb2986411/ALZ-21-e70132-g003.jpg
