Temporal characterization of Alzheimer's Disease with sequences of clinical records.

Author information

Department of Medicine, Massachusetts General Hospital, Boston, MA, USA.

Department of Medicine, Massachusetts General Hospital, Boston, MA, USA; Harvard Medical School, Harvard-MIT Program in Health Sciences and Technology, USA.

Publication information

EBioMedicine. 2023 Jun;92:104629. doi: 10.1016/j.ebiom.2023.104629. Epub 2023 May 27.

Abstract

BACKGROUND

Alzheimer's Disease (AD) is a complex clinical phenotype with unprecedented social and economic tolls on an ageing global population. Real-world data (RWD) from electronic health records (EHRs) offer opportunities to accelerate precision drug development and scale epidemiological research on AD. A precise characterization of AD cohorts is needed to address the noise abundant in RWD.

METHODS

We conducted a retrospective cohort study to develop and test computational models for AD cohort identification using clinical data from 8 Massachusetts healthcare systems. We mined temporal representations from EHR data using the transitive sequential pattern mining algorithm (tSPM) to train and validate our models. We then tested our models against a held-out test set from a review of medical records to adjudicate the presence of AD. We trained two classes of Machine Learning models, using Gradient Boosting Machine (GBM), to compare the utility of AD diagnosis records versus the tSPM temporal representations (comprising sequences of diagnosis and medication observations) from electronic medical records for characterizing AD cohorts.
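The idea of mining transitive sequences from time-stamped records can be illustrated with a minimal Python sketch. It is not the authors' tSPM implementation: it assumes that each patient's record is a list of (ISO date, code) observations, that only the first occurrence of each code is used, and that a sequence is any ordered pair of codes in which the first code strictly precedes the second, whether or not other codes fall between them. The patient identifiers and code names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def first_occurrences(records):
    """Keep the earliest date seen for each code in one patient's records."""
    first = {}
    for date, code in records:
        if code not in first or date < first[code]:
            first[code] = date
    return first


def transitive_sequences(records):
    """Return all ordered pairs (a, b) where a first occurs strictly before b.

    Pairs are not limited to adjacent observations, which is the
    'transitive' aspect assumed in this sketch.
    """
    first = first_occurrences(records)
    # ISO-formatted date strings sort chronologically.
    ordered = sorted(first.items(), key=lambda item: item[1])
    pairs = set()
    for (a, date_a), (b, date_b) in combinations(ordered, 2):
        if date_a < date_b:  # same-day observations are left unordered
            pairs.add((a, b))
    return pairs


def sequence_support(patients):
    """Count how many patients exhibit each sequence."""
    support = defaultdict(int)
    for records in patients.values():
        for pair in transitive_sequences(records):
            support[pair] += 1
    return dict(support)


# Hypothetical toy data: diagnosis (dx) and medication (rx) observations.
patients = {
    "p1": [
        ("2019-01-05", "dx:memory_loss"),
        ("2019-06-10", "rx:donepezil"),
        ("2020-02-01", "dx:alzheimers"),
    ],
    "p2": [
        ("2018-03-01", "dx:memory_loss"),
        ("2018-09-15", "dx:alzheimers"),
    ],
}

support = sequence_support(patients)
```

In a pipeline of the kind described above, each mined sequence would become a binary feature per patient, and a classifier such as a gradient boosting model could then be trained on those features; that step is omitted here.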

FINDINGS

In a group of 4985 patients, we identified 219 tSPM temporal representations (i.e., transitive sequences) of medical records for constructing the best classification models. The models with sequential features improved AD classification by 3 to 16 percent over the use of AD diagnosis codes alone. The computed cohort included 663 patients, 35 of whom had no record of AD. Six groups of tSPM sequences were identified for characterizing the AD cohorts.

INTERPRETATION

We present sequential patterns of diagnosis and medication codes from electronic medical records as digital markers of Alzheimer's Disease. Classification algorithms developed on sequential patterns can replace standard features from EHRs to enrich phenotype modelling.

FUNDING

National Institutes of Health: the National Institute on Aging (RF1AG074372) and the National Institute of Allergy and Infectious Diseases (R01AI165535).

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3f47/10236187/472a4f21c5e7/gr1.jpg
