Jagannatha Abhyuday N, Yu Hong
University of Massachusetts, MA, USA.
University of Massachusetts, MA, USA; Bedford VAMC and CHOIR, MA, USA.
Proc Conf Empir Methods Nat Lang Process. 2016 Nov;2016:856-865. doi: 10.18653/v1/d16-1082.
Sequence labeling is a widely used method for named entity recognition and information extraction from unstructured natural language data. In the clinical domain, one major application of sequence labeling is the extraction of medical entities such as medication, indication, and side effects from Electronic Health Record narratives. Sequence labeling in this domain presents its own set of challenges and objectives. In this work we experiment with various CRF-based structured learning models combined with Recurrent Neural Networks. We extend the previously studied LSTM-CRF models with explicit modeling of pairwise potentials. We also propose an approximate version of skip-chain CRF inference with RNN potentials. We use these methods for structured prediction to improve exact phrase detection of various medical entities.
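As a rough illustration of why pairwise potentials matter for exact phrase detection, the sketch below runs standard Viterbi decoding for a linear-chain CRF over per-token (unary) scores and label-transition (pairwise) scores. It is not the authors' implementation: the function name `viterbi_decode`, the three-label O/B/I scheme, and all scores are hypothetical, the unary scores are hand-written stand-ins for RNN outputs, and the approximate skip-chain inference mentioned above is not shown.

```python
from typing import List, Sequence


def viterbi_decode(unary: Sequence[Sequence[float]],
                   pairwise: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label sequence for a linear-chain CRF.

    unary[t][y]      : score of label y at position t (stand-in for RNN output).
    pairwise[y0][y1] : score of moving from label y0 to label y1.
    """
    if not unary:
        return []
    n_labels = len(unary[0])
    # best[y] = best score of any path that ends in label y at the current position
    best = list(unary[0])
    backpointers = []
    for t in range(1, len(unary)):
        new_best = []
        pointers = []
        for y in range(n_labels):
            # Pick the previous label that maximizes path score plus transition score.
            prev = max(range(n_labels), key=lambda y0: best[y0] + pairwise[y0][y])
            new_best.append(best[prev] + pairwise[prev][y] + unary[t][y])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final label to recover the full path.
    y = max(range(n_labels), key=lambda k: best[k])
    path = [y]
    for pointers in reversed(backpointers):
        y = pointers[y]
        path.append(y)
    path.reverse()
    return path


# Hypothetical labels: 0 = O (outside), 1 = B (begin entity), 2 = I (inside entity).
unary = [
    [0.1, 2.0, 0.0],  # token 0: strongly B
    [1.0, 0.0, 0.9],  # token 1: unary scores slightly prefer O over I
    [2.0, 0.0, 0.0],  # token 2: strongly O
]
pairwise = [
    [0.0, 0.0, -10.0],  # O -> I is heavily penalized
    [0.0, 0.0, 1.0],    # B -> I is encouraged
    [0.0, 0.0, 0.5],    # I -> I is mildly encouraged
]
print(viterbi_decode(unary, pairwise))  # [1, 2, 0]
```

Labeling each token independently by its highest unary score would give [1, 0, 0], which cuts the entity phrase short after its first token. With the pairwise scores included, the decoder returns [1, 2, 0] and keeps the two-token phrase intact.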