Recent Advances in Clinical Natural Language Processing in Support of Semantic Analysis.
Author information
Velupillai S, Mowery D, South B R, Kvist M, Dalianis H
Affiliation
Sumithra Velupillai, Department of Computer and Systems Sciences, Stockholm University, Postbox 7003, 164 07 Kista, Sweden, Tel: +46 8 161 174, Fax: +46 8 703 9025
Publication information
Yearb Med Inform. 2015 Aug 13;10(1):183-93. doi: 10.15265/IY-2015-009.
OBJECTIVES
We present a review of recent advances in clinical Natural Language Processing (NLP), with a focus on semantic analysis and key subtasks that support such analysis.
METHODS
We conducted a literature review of clinical NLP research from 2008 to 2014, emphasizing recent publications (2012-2014), based on PubMed and ACL proceedings as well as relevant referenced publications from the included papers.
RESULTS
Significant articles published within this time span were included and are discussed from the perspective of semantic analysis. Three key clinical NLP subtasks that enable such analysis were identified: 1) developing more efficient methods for corpus creation (annotation and de-identification), 2) generating building blocks for extracting meaning (morphological, syntactic, and semantic subtasks), and 3) leveraging NLP for clinical utility (NLP applications and infrastructure for clinical use cases). Finally, we reflect on the most recent developments and on potential areas for future NLP development and applications.
CONCLUSIONS
Advances within key NLP subtasks that support semantic analysis have increased. In many cases, the performance of NLP semantic analysis is close to the level of agreement between humans. The creation and release of corpora annotated with complex semantic information models have greatly supported the development of new tools and approaches. Research on non-English languages is growing steadily. NLP methods have, in some cases, been successfully employed in real-world clinical tasks. However, there is still a gap between the development of advanced resources and their use in clinical settings. Many new clinical use cases are emerging, driven by established health care initiatives and by additional patient-generated sources arising from the extensive use of social media and other devices.