[Extraction of features from clinical routine data using text mining].

Authors

Grundel Bastian, Bernardeau Marc-Antoine, Langner Holger, Schmidt Christoph, Böhringer Daniel, Ritter Marc, Rosenthal Paul, Grandjean Andrea, Schulz Stefan, Daumke Philipp, Stahl Andreas

Affiliations

Klinik und Poliklinik für Augenheilkunde, Universitätsmedizin Greifswald, Greifswald, Deutschland.

Professur Medieninformatik, Hochschule Mittweida, Mittweida, Deutschland.

Publication information

Ophthalmologe. 2021 Mar;118(3):264-272. doi: 10.1007/s00347-020-01177-4.

Abstract

BACKGROUND

Anti-vascular endothelial growth factor (anti-VEGF) drugs are currently used to treat macular diseases. Their use has generated a wealth of additional data that could help to understand and predict treatment courses; however, this information is usually available only as free text.

OBJECTIVE

A retrospective study was designed to analyze to what extent interpretable information can be obtained from clinical texts by automated extraction. The aim was to assess the suitability of a text mining method customized for this purpose.

MATERIAL AND METHODS

Data on 3683 patients were available, including 40,485 discharge letters. Some of the data of interest, e.g. visual acuity (VA), intraocular pressure (IOP) and accompanying diagnoses, were recorded not only as text but also in a database and could thus serve as a gold standard for the text analysis. The text was analyzed using the Averbis Health Discovery text mining platform. To optimize the extraction task, rule knowledge and a German-language technical vocabulary linked to the international medical terminology standard Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) were added manually.
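To illustrate what a single hand-written extraction rule can look like, the following is a minimal sketch in Python. It is not the Averbis Health Discovery configuration used in the study, which is not described here. The letter wording, the eye labels `RA`/`LA` (right/left eye) and the names `IOP_PATTERN` and `extract_iop` are assumptions made purely for illustration.

```python
import re

# Hypothetical rule: an eye label (RA = right eye, LA = left eye),
# an optional colon, and a one- or two-digit pressure followed by "mmHg".
# Real discharge letters will use more varied wording than this.
IOP_PATTERN = re.compile(r"\b(RA|LA)\s*:?\s*(\d{1,2})\s*mmHg", re.IGNORECASE)


def extract_iop(text):
    """Return {eye: pressure in mmHg} for each eye/pressure pair found in text."""
    return {eye.upper(): int(value) for eye, value in IOP_PATTERN.findall(text)}


# Invented example sentence, not taken from the study corpus.
sample = "Tensio RA 15 mmHg, LA: 17 mmHg."
print(extract_iop(sample))
```

A structured result of this kind can be compared field by field with the corresponding database entry, which is what makes a gold-standard evaluation possible.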

RESULTS

The correspondence between the extracted data and the structured database entries is described by the F1 value. Agreement was 94.7% for VA, 98.3% for IOP and 94.7% for the accompanying diagnoses. Manual analysis of the noncorresponding cases showed that in 50% of them the text content did not match the database content, for various reasons. After an adjustment, F1 values 1-3% above the previously determined values were obtained.
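For readers unfamiliar with the F1 value, it is the harmonic mean of precision and recall. The sketch below shows one way to compute it when extracted values and gold-standard entries are each represented as a set of (record, value) pairs. This representation and the function name `f1_score` are assumptions for illustration; the abstract does not state how matches were counted in the study.

```python
def f1_score(extracted, gold):
    """F1 value between extracted items and gold-standard items.

    Both arguments are iterables of hashable items, here assumed to be
    (record_id, value) pairs. An extracted item counts as a true positive
    only if the identical pair is present in the gold standard.
    """
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(extracted)  # share of extractions that are correct
    recall = true_positives / len(gold)          # share of gold entries that were found
    return 2 * precision * recall / (precision + recall)


# Invented toy data: three IOP entries, one of which was extracted incorrectly.
gold = {("letter1", 15), ("letter2", 17), ("letter3", 12)}
extracted = {("letter1", 15), ("letter2", 17), ("letter3", 13)}
print(round(f1_score(extracted, gold), 3))
```

In this toy case two of three extractions match, so precision and recall are both 2/3 and the F1 value is 2/3 as well.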

CONCLUSION

For the discharge letter corpus and the task considered here, text mining procedures are very well suited to extracting content from clinical texts in a structured form for further evaluation.

