Kim Seongsoon, Park Donghyeon, Choi Yonghwa, Lee Kyubum, Kim Byounggun, Jeon Minji, Kim Jihye, Tan Aik Choon, Kang Jaewoo
Department of Computer Science and Engineering, College of Informatics, Korea University, Seoul, Republic of Korea.
Interdisciplinary Graduate Program in Bioinformatics, Korea University, Seoul, Republic of Korea.
JMIR Med Inform. 2018 Jan 5;6(1):e2. doi: 10.2196/medinform.8751.
With the development of artificial intelligence (AI) technology centered on deep learning, computers have evolved to the point where they can read a given text and answer a question based on its context. This task is known as machine comprehension. Existing machine comprehension tasks mostly use datasets of general texts, such as news articles or elementary school-level storybooks. However, no attempt has been made to determine whether an up-to-date deep learning-based machine comprehension model can also process scientific literature containing expert-level knowledge, especially in the biomedical domain.
This study aims to investigate whether a machine comprehension model can process biomedical articles as well as general texts. Since there is no dataset for the biomedical literature comprehension task, our work includes generating a large-scale question answering dataset using PubMed and manually evaluating the generated dataset.
We present an attention-based deep neural model tailored to the biomedical domain. To further enhance the performance of our model, we used pretrained word vectors and biomedical entity type embeddings. We also developed an ensemble method that combines the results of several independent models to reduce the variance of the models' answers.
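The ensemble idea can be illustrated with a minimal sketch. The abstract does not state how the individual results are combined, so the averaging rule below is an assumption, and the function name `ensemble_answer` and the `@entity` candidate labels are hypothetical. Each model is assumed to output a probability for every candidate answer.

```python
from collections import defaultdict


def ensemble_answer(model_outputs):
    """Average per-candidate probabilities across independent models.

    model_outputs: list of dicts, one per model, mapping a candidate
    answer to the probability that model assigned to it.
    Returns (best_candidate, averaged_scores).

    Assumption: simple unweighted averaging. The combination rule used
    in the study is not given in the abstract.
    """
    if not model_outputs:
        raise ValueError("at least one model output is required")

    totals = defaultdict(float)
    for scores in model_outputs:
        for candidate, prob in scores.items():
            totals[candidate] += prob

    # A candidate missing from a model's output counts as probability 0.
    n_models = len(model_outputs)
    averaged = {cand: total / n_models for cand, total in totals.items()}

    best = max(averaged, key=averaged.get)
    return best, averaged


if __name__ == "__main__":
    # Toy scores from three hypothetical models (illustrative values only).
    outputs = [
        {"@entity1": 0.6, "@entity2": 0.4},
        {"@entity1": 0.3, "@entity2": 0.7},
        {"@entity1": 0.2, "@entity2": 0.8},
    ]
    best, averaged = ensemble_answer(outputs)
    print(best, averaged)
```

Averaging smooths out the disagreement between individually trained models: in the toy input, one model prefers `@entity1`, but the averaged scores favor `@entity2`.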
The experimental results showed that our proposed deep neural network model outperformed the baseline model by more than 7% on the new dataset. We also evaluated human performance on the new dataset. The human evaluation result showed that our deep neural model outperformed humans in comprehension by 22% on average.
In this work, we introduced a new task of machine comprehension in the biomedical domain using a deep neural model. Since there was no large-scale dataset for training deep neural models in the biomedical domain, we created the new cloze-style datasets Biomedical Knowledge Comprehension Title (BMKC_T) and Biomedical Knowledge Comprehension Last Sentence (BMKC_LS) (together referred to as BioMedical Knowledge Comprehension) using the PubMed corpus. The experimental results showed that our model performs much better than humans on these datasets. We observed that our model performed consistently better regardless of the difficulty of a text, whereas humans have difficulty with biomedical literature comprehension tasks that require expert-level knowledge.
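A cloze-style example of this kind can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not describe the construction procedure, so the placeholder token, the function `make_cloze_examples`, the rule that an answer entity must also occur in the context, and the toy text are all hypothetical. The question sentence stands for the title (as in BMKC_T) or the last sentence (as in BMKC_LS), and the entity list is assumed to come from a separate biomedical entity recognizer.

```python
import re

PLACEHOLDER = "XXXXX"  # hypothetical blank token, not taken from the paper


def make_cloze_examples(question_sentence, context, entities):
    """Build cloze-style examples from one abstract.

    question_sentence: the title or the last sentence of the abstract.
    context: the remaining abstract text the model reads.
    entities: entity strings assumed to be pre-extracted.

    For every entity that occurs in both the question sentence and the
    context, blank it out in the question and record it as the answer.
    Assumes entities start and end with a word character, since the
    match relies on \\b word boundaries.
    """
    def pattern_for(entity):
        return re.compile(r"\b" + re.escape(entity) + r"\b", re.IGNORECASE)

    # Candidate answers are the entities that actually appear in the context.
    candidates = [e for e in entities if pattern_for(e).search(context)]

    examples = []
    for entity in candidates:
        pattern = pattern_for(entity)
        if pattern.search(question_sentence):
            examples.append({
                "context": context,
                "question": pattern.sub(PLACEHOLDER, question_sentence),
                "answer": entity,
                "candidates": candidates,
            })
    return examples


if __name__ == "__main__":
    # Made-up text for illustration; not drawn from the dataset.
    title = "Gefitinib resistance in EGFR-mutant lung cancer"
    body = "We studied EGFR signaling in patients treated with gefitinib."
    for example in make_cloze_examples(title, body, ["gefitinib", "EGFR", "KRAS"]):
        print(example["question"], "->", example["answer"])
```

In this toy run, `KRAS` yields no example because it appears in neither the title nor the body, which keeps every answer recoverable from the context.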