Coronary artery disease risk assessment from unstructured electronic health records using text mining.

Authors

Jonnagaddala Jitendra, Liaw Siaw-Teng, Ray Pradeep, Kumar Manish, Chang Nai-Wen, Dai Hong-Jie

Affiliations

School of Public Health and Community Medicine, University of New South Wales, Australia; Asia-Pacific Ubiquitous Healthcare Research Centre, University of New South Wales, Australia; Prince of Wales Clinical School, University of New South Wales, Australia.

School of Public Health and Community Medicine, University of New South Wales, Australia.

Publication information

J Biomed Inform. 2015 Dec;58 Suppl(Suppl):S203-S210. doi: 10.1016/j.jbi.2015.08.003. Epub 2015 Aug 28.

Abstract

Coronary artery disease (CAD) often leads to myocardial infarction, which may be fatal. Risk factors can be used to predict CAD, which may subsequently lead to prevention or early intervention. Patient data such as co-morbidities, medication history, social history and family history are required to determine the risk factors for a disease. However, risk factor data are usually embedded in unstructured clinical narratives if the data is not collected specifically for risk assessment purposes. Clinical text mining can be used to extract data related to risk factors from unstructured clinical notes. This study presents methods to extract Framingham risk factors from unstructured electronic health records using clinical text mining and to calculate 10-year coronary artery disease risk scores in a cohort of diabetic patients. We developed a rule-based system to extract risk factors: age, gender, total cholesterol, HDL-C, blood pressure, diabetes history and smoking history. The results showed that the output from the text mining system was reliable, but there was a significant amount of missing data to calculate the Framingham risk score. A systematic approach for understanding missing data was followed by implementation of imputation strategies. An analysis of the 10-year Framingham risk scores for coronary artery disease in this cohort has shown that the majority of the diabetic patients are at moderate risk of CAD.
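
As a rough illustration of what rule-based extraction of these risk factors from free text could look like, the following Python sketch uses regular expressions. Every pattern, field name, and function name here (`extract_risk_factors`, `missing_fields`) is a hypothetical assumption made for illustration; the abstract does not describe the actual rules the authors used, and real clinical notes would need far more robust handling of negation, units, and context.

```python
import re
from typing import Any, Dict, List

# All patterns below are hypothetical examples, NOT the rules used in the study.
AGE = re.compile(r"\b(\d{1,3})[- ]year[- ]old\b", re.I)
GENDER = re.compile(r"\b(male|female|man|woman)\b", re.I)
TOTAL_CHOL = re.compile(r"\btotal cholesterol\D{0,15}(\d{2,3})\b", re.I)
HDL = re.compile(r"\bHDL(?:-C)?\D{0,15}(\d{2,3})\b", re.I)
BP = re.compile(r"\b(?:BP|blood pressure)\D{0,15}(\d{2,3})\s*/\s*(\d{2,3})\b", re.I)
DIABETES = re.compile(r"\b(?:diabetes|diabetic)\b", re.I)
SMOKER_NEG = re.compile(
    r"\b(?:non-?smoker|never smoked|denies smoking|does not smoke)\b", re.I
)
SMOKER_POS = re.compile(r"\b(?:current smoker|smokes|smoking)\b", re.I)

GENDER_MAP = {"male": "male", "man": "male", "female": "female", "woman": "female"}


def _first_int(pattern: "re.Pattern[str]", text: str) -> Any:
    """Return the first captured group as an int, or None if there is no match."""
    match = pattern.search(text)
    return int(match.group(1)) if match else None


def extract_risk_factors(note: str) -> Dict[str, Any]:
    """Extract Framingham-style risk factors from one note; None means not found."""
    gender_match = GENDER.search(note)
    bp_match = BP.search(note)

    # Check negated smoking phrases first so "denies smoking" is not read as positive.
    if SMOKER_NEG.search(note):
        smoker = False
    elif SMOKER_POS.search(note):
        smoker = True
    else:
        smoker = None

    return {
        "age": _first_int(AGE, note),
        "gender": GENDER_MAP[gender_match.group(1).lower()] if gender_match else None,
        "total_cholesterol": _first_int(TOTAL_CHOL, note),
        "hdl_c": _first_int(HDL, note),
        "systolic_bp": int(bp_match.group(1)) if bp_match else None,
        "diastolic_bp": int(bp_match.group(2)) if bp_match else None,
        # A mention is treated as positive; absence is left as None, not False.
        "diabetes": True if DIABETES.search(note) else None,
        "smoker": smoker,
    }


def missing_fields(record: Dict[str, Any]) -> List[str]:
    """List the risk factors that could not be extracted from a note."""
    return sorted(key for key, value in record.items() if value is None)


if __name__ == "__main__":
    example = (
        "62-year-old male with type 2 diabetes. Denies smoking. "
        "BP 140/90. Total cholesterol 210, HDL-C: 45."
    )
    record = extract_risk_factors(example)
    print(record)
    print("missing:", missing_fields(record))
```

Returning `None` for anything not found keeps missing values explicit, which matters here because the abstract reports that a substantial share of the data needed for the Framingham risk score was missing and had to be imputed.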
