Automated extraction of clinical traits of multiple sclerosis in electronic medical records.

Affiliation

Center for Human Genetics Research, Vanderbilt University Medical Center, Nashville, Tennessee, USA.

Publication information

J Am Med Inform Assoc. 2013 Dec;20(e2):e334-40. doi: 10.1136/amiajnl-2013-001999. Epub 2013 Oct 22.

Abstract

OBJECTIVES

The clinical course of multiple sclerosis (MS) is highly variable, and research data collection is costly and time-consuming. We evaluated natural language processing techniques applied to electronic medical records (EMR) to identify MS patients and the key clinical traits of their disease course.

MATERIALS AND METHODS

We used four algorithms based on ICD-9 codes, text keywords, and medications to identify individuals with MS from a de-identified, research version of the EMR at Vanderbilt University. Using a training dataset of the records of 899 individuals, algorithms were constructed to identify and extract detailed information regarding the clinical course of MS from the text of the medical records, including clinical subtype, presence of oligoclonal bands, year of diagnosis, year and origin of first symptom, Expanded Disability Status Scale (EDSS) scores, timed 25-foot walk scores, and MS medications. Algorithms were evaluated on a test set validated by two independent reviewers.
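The abstract does not give the extraction rules themselves. As a rough illustration of what keyword-anchored extraction of one trait (EDSS scores) from note text could look like, here is a minimal sketch. The pattern, the 20-character window, and the function name `extract_edss` are assumptions made for this example and are not the authors' algorithm; the only domain fact relied on is that EDSS scores run from 0 to 10 in half-point steps.

```python
import re

# Hypothetical pattern (not from the paper): the keyword "EDSS", then up to
# 20 non-digit characters on the same line, then a score from 0 to 10 in
# 0.5 steps. The trailing lookahead rejects values such as 3.7 or 12 while
# still allowing a sentence-ending period after the score.
EDSS_PATTERN = re.compile(
    r"\bEDSS\b[^0-9\n]{0,20}?(10(?:\.0)?|[0-9](?:\.[05])?)(?!\d|\.\d)",
    re.IGNORECASE,
)


def extract_edss(note: str) -> list[float]:
    """Return every EDSS score mentioned in a free-text note, in order."""
    return [float(match.group(1)) for match in EDSS_PATTERN.finditer(note)]


if __name__ == "__main__":
    # Invented example sentence, not taken from any medical record.
    print(extract_edss("EDSS 6.0; previously EDSS was 4."))
```

A real pipeline would also need to handle negation, date ranges, and scores reported for other scales, which this sketch ignores.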

RESULTS

We identified 5789 individuals with MS. For all clinical traits extracted, precision was at least 87% and specificity was greater than 80%. Recall values for clinical subtype, EDSS scores, and timed 25-foot walk scores were greater than 80%.
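For readers less familiar with the three metrics reported here, the sketch below computes them from the counts of a reviewer-validated test set, using the standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN), specificity = TN / (TN + FP). The `Confusion` class and the counts in the usage lines are invented for illustration and are not data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts from comparing algorithm output with reviewer labels."""
    tp: int  # trait extracted and confirmed by reviewers
    fp: int  # trait extracted but not confirmed
    tn: int  # trait not extracted and truly absent
    fn: int  # trait present in the record but missed


def _ratio(numerator: int, denominator: int) -> float:
    # Undefined metrics (empty denominator) are reported as NaN.
    return numerator / denominator if denominator else float("nan")


def precision(c: Confusion) -> float:
    return _ratio(c.tp, c.tp + c.fp)


def recall(c: Confusion) -> float:
    return _ratio(c.tp, c.tp + c.fn)


def specificity(c: Confusion) -> float:
    return _ratio(c.tn, c.tn + c.fp)


if __name__ == "__main__":
    # Made-up counts, purely to show the calls.
    example = Confusion(tp=8, fp=2, tn=9, fn=1)
    print(precision(example), recall(example), specificity(example))
```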

DISCUSSION AND CONCLUSION

This collection of clinical data represents one of the largest databases of detailed clinical traits available for research on MS. This work demonstrates that detailed clinical information is recorded in the EMR and can be extracted for research purposes with high reliability.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57f7/3861927/8590fad1de48/amiajnl-2013-001999f01.jpg
