A Pilot Study of Biomedical Text Comprehension using an Attention-Based Deep Neural Reader: Design and Experimental Analysis.

Authors

Kim Seongsoon, Park Donghyeon, Choi Yonghwa, Lee Kyubum, Kim Byounggun, Jeon Minji, Kim Jihye, Tan Aik Choon, Kang Jaewoo

Affiliations

Department of Computer Science and Engineering, College of Informatics, Korea University, Seoul, Republic of Korea.

Interdisciplinary Graduate Program in Bioinformatics, Korea University, Seoul, Republic of Korea.

Publication information

JMIR Med Inform. 2018 Jan 5;6(1):e2. doi: 10.2196/medinform.8751.

DOI:10.2196/medinform.8751
PMID:29305341
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5783222/
Abstract

BACKGROUND

With the development of artificial intelligence (AI) technology centered on deep learning, computers have evolved to the point where they can read a given text and answer a question based on its context. This task is known as machine comprehension. Existing machine comprehension tasks mostly use datasets of general texts, such as news articles or elementary school-level storybooks. However, no attempt has been made to determine whether an up-to-date deep learning-based machine comprehension model can also process scientific literature containing expert-level knowledge, especially in the biomedical domain.

OBJECTIVE

This study aims to investigate whether a machine comprehension model can process biomedical articles as well as general texts. Since there is no dataset for the biomedical literature comprehension task, our work includes generating a large-scale question answering dataset using PubMed and manually evaluating the generated dataset.

METHODS

We present an attention-based deep neural model tailored to the biomedical domain. To further enhance the performance of our model, we used a pretrained word vector and biomedical entity type embedding. We also developed an ensemble method of combining the results of several independent models to reduce the variance of the answers from the models.
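The ensemble step can be illustrated with a minimal sketch. The abstract does not say how the independent models' results are combined, so the averaging rule below is an assumption, and the function name `ensemble_answer`, the candidate labels, and the probabilities are all hypothetical.

```python
from typing import Dict, List


def ensemble_answer(model_probs: List[Dict[str, float]]) -> str:
    """Average each candidate's probability across models and return the top one.

    Assumption: every model outputs a probability per candidate answer.
    Averaging is one common way to combine them; the paper's exact rule
    is not given in the abstract.
    """
    if not model_probs:
        raise ValueError("need predictions from at least one model")
    totals: Dict[str, float] = {}
    for probs in model_probs:
        for candidate, p in probs.items():
            totals[candidate] = totals.get(candidate, 0.0) + p
    n_models = len(model_probs)
    averaged = {c: total / n_models for c, total in totals.items()}
    # Pick the candidate with the highest mean probability.
    return max(averaged, key=lambda c: averaged[c])


# Hypothetical outputs from three independently trained models.
predictions = [
    {"@entity1": 0.6, "@entity2": 0.3, "@entity3": 0.1},
    {"@entity1": 0.3, "@entity2": 0.6, "@entity3": 0.1},
    {"@entity1": 0.5, "@entity2": 0.4, "@entity3": 0.1},
]
print(ensemble_answer(predictions))  # prints @entity1
```

Averaging smooths out the disagreement of any single model, which is the variance reduction the authors describe.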

RESULTS

The experimental results showed that our proposed deep neural network model outperformed the baseline model by more than 7% on the new dataset. We also evaluated human performance on the new dataset. The human evaluation result showed that our deep neural model outperformed humans in comprehension by 22% on average.

CONCLUSIONS

In this work, we introduced a new task of machine comprehension in the biomedical domain using a deep neural model. Since there was no large-scale dataset for training deep neural models in the biomedical domain, we created the new cloze-style datasets Biomedical Knowledge Comprehension Title (BMKC_T) and Biomedical Knowledge Comprehension Last Sentence (BMKC_LS) (together referred to as BioMedical Knowledge Comprehension) using the PubMed corpus. The experimental results showed that the performance of our model is much higher than that of humans. We observed that our model performed consistently better regardless of the degree of difficulty of a text, whereas humans have difficulty when performing biomedical literature comprehension tasks that require expert-level knowledge.
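A cloze-style example of the kind named above can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: the abstract only states that the datasets are built from titles (BMKC_T) and last sentences (BMKC_LS) of PubMed articles. The `@placeholder` token, the `make_cloze` helper, the requirement that the answer also appear in the context, and the sample text are all hypothetical.

```python
from typing import List, NamedTuple, Optional

PLACEHOLDER = "@placeholder"  # hypothetical blank token


class ClozeExample(NamedTuple):
    context: str   # text the model reads
    question: str  # target sentence with one entity blanked out
    answer: str    # the entity that was removed


def make_cloze(context: str, target: str, entities: List[str]) -> Optional[ClozeExample]:
    """Blank out the first listed entity found in both the target sentence and the context.

    Assumption: an example is only useful if the answer can be found in the
    context, so entities missing from the context are skipped.
    """
    for entity in entities:
        if entity in target and entity in context:
            question = target.replace(entity, PLACEHOLDER, 1)
            return ClozeExample(context, question, entity)
    return None


# Hypothetical article text; for a title-based example the target is the title.
context = "We measured the effect of aspirin on platelet aggregation in adults."
title = "Effect of aspirin on platelet aggregation"
example = make_cloze(context, title, ["aspirin", "platelet aggregation"])
print(example.question)  # prints Effect of @placeholder on platelet aggregation
print(example.answer)    # prints aspirin
```

For a last-sentence example, the same helper would take the final sentence of the abstract as `target` and the remaining sentences as `context`.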

Figures

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1de2/5783222/d98cf36ef577/medinform_v6i1e2_fig1.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1de2/5783222/ae3545688e00/medinform_v6i1e2_fig2.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1de2/5783222/f2230f7f158a/medinform_v6i1e2_fig3.jpg
