De-identification of medical records using conditional random fields and long short-term memory networks.

Affiliation

School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang 150001, China.

Publication information

J Biomed Inform. 2017 Nov;75S:S43-S53. doi: 10.1016/j.jbi.2017.10.003. Epub 2017 Oct 13.

Abstract

The CEGS N-GRID 2016 Shared Task 1 in Clinical Natural Language Processing focuses on the de-identification of psychiatric evaluation records. This paper describes two participating systems of our team, based on conditional random fields (CRFs) and long short-term memory networks (LSTMs). A pre-processing module was introduced for sentence detection and tokenization before de-identification. For CRFs, manually extracted rich features were utilized to train the model. For LSTMs, a character-level bi-directional LSTM network was applied to represent tokens and classify tags for each token, following which a decoding layer was stacked to decode the most probable protected health information (PHI) terms. The LSTM-based system attained an i2b2 strict micro-F measure of 0.8986, which was higher than that of the CRF-based system.
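The strict micro-F measure reported above can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not state: tags follow a BIO scheme, spans are compared at token level, and a predicted PHI term counts as correct only if its boundaries and type both match a gold term exactly. The function names `bio_to_spans` and `strict_micro_f` are hypothetical, and this is not the official evaluation script of the shared task.

```python
from typing import List, Set, Tuple

# (start_token, end_token_exclusive, phi_type); token-level spans are an assumption
Span = Tuple[int, int, str]


def bio_to_spans(tags: List[str]) -> Set[Span]:
    """Decode a BIO tag sequence (assumed scheme) into a set of typed spans."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # trailing "O" flushes the last span
        continues = tag.startswith("I-") and label == tag[2:]
        if continues:
            continue
        if label is not None:
            spans.add((start, i, label))
        if tag == "O":
            start, label = None, None
        else:
            # "B-X", or a stray "I-X" treated as the start of a new span
            start, label = i, tag[2:]
    return spans


def strict_micro_f(
    gold: List[List[str]], pred: List[List[str]]
) -> Tuple[float, float, float]:
    """Micro-averaged precision, recall, and F1 under exact span and type match."""
    tp = n_gold = n_pred = 0
    for gold_tags, pred_tags in zip(gold, pred):
        g, p = bio_to_spans(gold_tags), bio_to_spans(pred_tags)
        tp += len(g & p)  # exact matches only
        n_gold += len(g)
        n_pred += len(p)
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_gold if n_gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    gold = [["B-NAME", "I-NAME", "O", "B-DATE"]]
    pred = [["B-NAME", "O", "O", "B-DATE"]]  # NAME span is truncated
    print(strict_micro_f(gold, pred))
```

In this toy input the truncated NAME span earns no credit under strict matching, so only the DATE span counts as a true positive. Micro-averaging pools the counts over all records before computing the scores, so frequent PHI types weigh more than rare ones.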
