Development of a machine learning model to predict mild cognitive impairment using natural language processing in the absence of screening.

Author affiliations

Kaiser Permanente Washington Health Research Institute, 1730 Minor Ave., Suite 1600, Seattle, WA, 98101, USA.

Janssen Research and Development, LLC, Raritan, USA.

Publication information

BMC Med Inform Decis Mak. 2022 May 12;22(1):129. doi: 10.1186/s12911-022-01864-z.

Abstract

BACKGROUND

Patients and their loved ones often report symptoms or complaints of cognitive decline, which clinicians note in free-text clinical notes without recording any structured screening or diagnostic data. These symptoms and complaints may be signals that predict who will go on to be diagnosed with mild cognitive impairment (MCI) and ultimately develop Alzheimer's disease or related dementias. Our objective was to develop a natural language processing (NLP) system and prediction model to identify MCI from clinical text in the absence of screening or other structured diagnostic information.

METHODS

The study drew on two populations: 1794 participants in the Adult Changes in Thought (ACT) study and 2391 patients from the general population of Kaiser Permanente Washington. All individuals had standardized cognitive assessment scores. We excluded patients with a diagnosis of Alzheimer's disease or dementia, or who used donepezil. We manually annotated 10,391 clinic notes to train the NLP model. Standard Python code was used to extract phrases from notes and map each phrase to a cognitive functioning concept. Concepts derived from the NLP system were used to predict future MCI. The prediction model was trained on the ACT cohort and 60% of the general population cohort, with the remaining 40% withheld for validation. We used least absolute shrinkage and selection operator (LASSO) logistic regression to fit a prediction model with MCI as the prediction target. Using the predicted case status from the LASSO model and known MCI status from standardized scores, we constructed receiver operating characteristic (ROC) curves to measure model performance.
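The phrase-to-concept step described above might look roughly like the following minimal sketch. It is not the study's code: the abstract does not list the concepts or phrases, so the lexicon, the concept names, and the `extract_concepts` helper are all hypothetical, and the sketch ignores negation ("denies memory loss"), which a real system would need to handle.

```python
import re
from collections import Counter

# Hypothetical lexicon: concept name -> regex patterns for phrases.
# The study's actual concepts and phrases are not given in the abstract.
CONCEPT_PATTERNS = {
    "memory_complaint": [
        r"\bmemory (?:loss|problems?|complaints?)\b",
        r"\bforget(?:s|ful|ting)?\b",
    ],
    "word_finding_difficulty": [
        r"\bword[- ]finding (?:difficulty|difficulties|problems?)\b",
    ],
    "disorientation": [
        r"\bgets? lost\b",
        r"\bdisoriented\b",
    ],
}

# Compile once so repeated calls over many notes stay cheap.
COMPILED = {
    concept: [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
    for concept, patterns in CONCEPT_PATTERNS.items()
}


def extract_concepts(note: str) -> Counter:
    """Count how often each concept's phrases appear in one clinic note."""
    counts = Counter()
    for concept, patterns in COMPILED.items():
        for pattern in patterns:
            counts[concept] += len(pattern.findall(note))
    return +counts  # unary plus drops concepts with a zero count


example_note = (
    "Daughter reports memory loss; patient is forgetful and gets lost driving."
)
concept_counts = extract_concepts(example_note)
```

Per-note counts like these could then be aggregated per patient and used as features for a penalized regression such as the LASSO model the authors describe.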

RESULTS

Chart abstraction identified 42 MCI concepts. Prediction model performance in the validation data set was modest, with an area under the curve (AUC) of 0.67. At a classification cutoff of 0.60, the classifier yielded a sensitivity of 1.7%, specificity of 99.7%, positive predictive value (PPV) of 70%, and negative predictive value (NPV) of 70.5% in the validation cohort.
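To make the reported metrics concrete, the sketch below shows how sensitivity, specificity, PPV, NPV, and AUC can be computed from true labels and predicted probabilities. It assumes the 0.60 cutoff is a threshold on predicted probability, which the abstract does not state explicitly. The labels and probabilities are invented toy values, not study data, and the function names are illustrative.

```python
def classification_metrics(y_true, y_prob, cutoff=0.60):
    """Sensitivity, specificity, PPV, and NPV at a probability cutoff."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted_positive = prob >= cutoff
        if predicted_positive and truth:
            tp += 1
        elif predicted_positive and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def auc(y_true, y_prob):
    """AUC as the probability that a random case outranks a random non-case."""
    cases = [p for t, p in zip(y_true, y_prob) if t]
    non_cases = [p for t, p in zip(y_true, y_prob) if not t]
    if not cases or not non_cases:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum(
        1.0 if c > n else 0.5 if c == n else 0.0
        for c in cases
        for n in non_cases
    )
    return wins / (len(cases) * len(non_cases))


# Toy data for illustration only (1 = MCI, 0 = no MCI).
toy_labels = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
toy_probs = [0.65, 0.40, 0.20, 0.62, 0.30, 0.25, 0.15, 0.10, 0.35, 0.05]

toy_metrics = classification_metrics(toy_labels, toy_probs, cutoff=0.60)
toy_auc = auc(toy_labels, toy_probs)
```

A high cutoff flags few patients as positive, which tends to push specificity up and sensitivity down, the same pattern seen in the reported results.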

DISCUSSION AND CONCLUSION

Although the sensitivity of the machine learning model was poor, negative predictive value was high, an important characteristic of models used for population-based screening. While an AUC of 0.67 is generally considered moderate performance, it is also comparable to several tests that are widely used in clinical practice.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6fee/9097352/1ed0115fc86f/12911_2022_1864_Fig1_HTML.jpg
