Capturing the Patient's Perspective: a Review of Advances in Natural Language Processing of Health-Related Text.

Authors

Gonzalez-Hernandez G, Sarker A, O'Connor K, Savova G

Publication information

Yearb Med Inform. 2017 Aug;26(1):214-227. doi: 10.15265/IY-2017-029. Epub 2017 Sep 11.

Abstract

Background: Natural Language Processing (NLP) methods are increasingly used to mine knowledge from unstructured health-related texts. Recent advances in noisy-text processing enable researchers and medical domain experts to go beyond the information encapsulated in published texts (e.g., clinical trials and systematic reviews) and structured questionnaires, and to obtain perspectives from other unstructured sources such as Electronic Health Records (EHRs) and social media posts.

Objectives: To review the recently published literature on the application of NLP techniques for mining health-related information from EHRs and social media posts.

Methods: The literature review covered research published over the last five years, identified through searches of PubMed, conference proceedings, and the ACM Digital Library, as well as relevant publications referenced in papers. We focused particularly on the techniques applied to EHRs and social media data.

Results: A set of 62 studies involving EHRs and 87 studies involving social media matched our criteria and were included in this paper. We present the purposes of these studies, outline the key NLP contributions, and discuss the general trends observed in the field, the current state of research, and important outstanding problems.

Conclusions: In recent years, there has been a continuing transition from lexical and rule-based systems to learning-based approaches, driven by the growth of annotated data sets and advances in data science. For EHRs, publicly available annotated data are still scarce, which is an obstacle to research progress. In contrast, research on social media mining has grown rapidly, particularly because the large amount of unlabeled data available via this resource compensates for the uncertainty inherent to the data. Effective mechanisms for filtering out noise and for mapping social media expressions to standard medical concepts are crucial and latent research problems. Shared tasks and other competitive challenges have been driving factors behind the implementation of open systems, and they are likely to play an essential role in the development of future systems.
