BioADAPT-MRC: adversarial learning-based domain adaptation improves biomedical machine reading comprehension task.

Affiliations

Department of Electrical Engineering and Computer Science, University of Tennessee, Knoxville, TN 37996, USA.

Cyber Resilience and Intelligence Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA.

Publication information

Bioinformatics. 2022 Sep 15;38(18):4369-4379. doi: 10.1093/bioinformatics/btac508.

DOI: 10.1093/bioinformatics/btac508
PMID: 35876792
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9477526/
Abstract

MOTIVATION

Biomedical machine reading comprehension (biomedical-MRC) aims to comprehend complex biomedical narratives and assist healthcare professionals in retrieving information from them. The high performance of modern neural network-based MRC systems depends on high-quality, large-scale, human-annotated training datasets. In the biomedical domain, a crucial challenge in creating such datasets is the requirement for domain knowledge, which makes labeled data scarce and creates the need for transfer learning from the labeled general-purpose (source) domain to the biomedical (target) domain. However, the marginal distributions of the general-purpose and biomedical domains differ because the two domains cover different topics. Therefore, directly transferring learned representations from a model trained on a general-purpose domain to the biomedical domain can hurt the model's performance.
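The distribution gap described above can be made concrete with a small sketch. It is not taken from the paper: it uses word-frequency distributions as a crude stand-in for the marginal input distributions, and the toy sentences and the function names (`unigram_distribution`, `kl_bits`, `js_divergence`) are assumptions introduced only for illustration. The Jensen-Shannon divergence, measured in bits, is 0 for identical distributions and at most 1.

```python
import math
from collections import Counter

def unigram_distribution(texts, vocab):
    """Relative frequency of each vocabulary word across a list of texts."""
    counts = Counter(tok for text in texts for tok in text.lower().split())
    total = sum(counts[w] for w in vocab)
    return [counts[w] / total for w in vocab]

def kl_bits(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits: symmetric and bounded by 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_bits(p, m) + 0.5 * kl_bits(q, m)

# Hypothetical toy corpora standing in for the source and target domains.
general_texts = [
    "the team won the football match",
    "the film opened in the city",
]
biomedical_texts = [
    "the protein binds the receptor",
    "the gene regulates the protein",
]

# Shared vocabulary so both distributions are defined over the same support.
vocab = sorted({tok for text in general_texts + biomedical_texts
                for tok in text.lower().split()})

p_general = unigram_distribution(general_texts, vocab)
p_biomedical = unigram_distribution(biomedical_texts, vocab)

gap = js_divergence(p_general, p_biomedical)
print(f"JS divergence between toy domains (bits): {gap:.3f}")
```

A value well above zero on real corpora would indicate that a model trained only on general-purpose text sees differently distributed inputs at test time, which is the situation domain adaptation is meant to address.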

RESULTS

We present an adversarial learning-based domain adaptation framework for the biomedical machine reading comprehension task (BioADAPT-MRC), a neural network-based method that addresses the discrepancies in the marginal distributions between the general and biomedical domain datasets. BioADAPT-MRC relaxes the need to generate pseudo labels in order to train a well-performing biomedical-MRC model. We extensively evaluate BioADAPT-MRC by comparing it with the best existing methods on three widely used benchmark biomedical-MRC datasets: BioASQ-7b, BioASQ-8b and BioASQ-9b. Our results suggest that, without using any synthetic or human-annotated data from the biomedical domain, BioADAPT-MRC can achieve state-of-the-art performance on these datasets.
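The abstract does not spell out the training objective, so the following is only a generic sketch of how adversarial domain adaptation is commonly set up, not BioADAPT-MRC's actual loss. It assumes a discriminator that predicts whether a feature vector came from the target domain, and an encoder that minimises the task loss on labeled source data while trying to increase the discriminator's loss. The names `bce`, `domain_loss`, `encoder_objective` and the weight `lam` are hypothetical. The domain labels record only which dataset an example came from, so no biomedical annotation is needed.

```python
import math

def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12
    prob = min(max(prob, eps), 1 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))

def domain_loss(domain_probs, domain_labels):
    """Mean discriminator loss; probs are P(example comes from the target domain)."""
    losses = [bce(p, y) for p, y in zip(domain_probs, domain_labels)]
    return sum(losses) / len(losses)

def encoder_objective(task_loss, dom_loss, lam=0.1):
    """Generic adversarial objective for the encoder (assumed form, not the paper's):
    minimise the source-domain task loss while maximising the discriminator's loss."""
    return task_loss - lam * dom_loss

# Domain labels: 0 = general-purpose (source), 1 = biomedical (target).
labels = [0, 0, 1, 1]

# Case 1: features are easy to tell apart, so the discriminator is confident.
separable = domain_loss([0.05, 0.10, 0.90, 0.95], labels)

# Case 2: features are domain-invariant, so the discriminator can only guess.
confused = domain_loss([0.5, 0.5, 0.5, 0.5], labels)

task_loss = 1.0  # hypothetical MRC loss computed on labeled source examples only
print(f"encoder objective, separable features: {encoder_objective(task_loss, separable):.4f}")
print(f"encoder objective, invariant features: {encoder_objective(task_loss, confused):.4f}")
```

With the task loss held fixed, the encoder's objective is lower when the discriminator is reduced to guessing, which is the pressure that pushes source and target features toward a shared distribution.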

AVAILABILITY AND IMPLEMENTATION

BioADAPT-MRC is freely available as an open-source project at https://github.com/mmahbub/BioADAPT-MRC.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.




