Data Processing and Text Mining Technologies on Electronic Medical Records: A Review.

Author affiliations

College of Computer, National University of Defense Technology, Changsha 410073, China.

Innovation Center, China Academy of Electronics and Information Technology, Beijing 100041, China.

Publication information

J Healthc Eng. 2018 Apr 8;2018:4302425. doi: 10.1155/2018/4302425. eCollection 2018.

Abstract

Medical institutions now generally use electronic medical records (EMR) to record patients' conditions, including diagnostic information, procedures performed, and treatment results. EMR has been recognized as a valuable resource for large-scale analysis. However, EMR data are diverse, incomplete, redundant, and privacy-sensitive, which makes it difficult to carry out data mining and analysis directly. The source data must therefore be preprocessed to improve data quality and, in turn, the data mining results. Different types of data require different processing technologies. Structured data generally calls for classic preprocessing techniques: data cleansing, data integration, data transformation, and data reduction. Semistructured or unstructured data, such as medical text, contains more health information and requires more complex and challenging processing methods. Information extraction from medical text mainly comprises named-entity recognition (NER) and relation extraction (RE). This paper focuses on the EMR processing pipeline and analyzes its key techniques in detail. In addition, we examine in depth the applications built on text mining, together with the open challenges and research issues for future work.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1601/5911323/25ef4e005ad1/JHE2018-4302425.001.jpg
