
Improving Biomedical Question Answering by Data Augmentation and Model Weighting.

Authors

Du Yongping, Yan Jingya, Lu Yuxuan, Zhao Yiliang, Jin Xingnan

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2023 Mar-Apr;20(2):1114-1124. doi: 10.1109/TCBB.2022.3171388. Epub 2023 Apr 3.

DOI: 10.1109/TCBB.2022.3171388
PMID: 35486563

Abstract

Biomedical question answering aims to extract an answer to a given question from a biomedical context. Because the domain demands specialized expertise, large-scale datasets for domain-specific question answering are harder to build. Existing methods are limited by the lack of training data, and their performance lags behind that of open-domain settings, degrading further on adversarial samples. We address these issues in three steps. First, we adopt data augmentation strategies to improve model training: sliding window, summarization, and round-trip translation. Second, we propose a model weighting strategy for final answer prediction in the biomedical domain, which combines the strengths of two models: the open-domain model QANet and BioBERT, which is pre-trained on biomedical domain data. Finally, we apply adversarial training to strengthen the robustness of the model. We evaluate our approach on the public biomedical dataset collected from PubMed and provided by the BioASQ challenge. The results show that model performance improves significantly compared with a single model and with other models that participated in the BioASQ challenge. The model learns richer semantic representations from data augmentation and adversarial samples, which helps it solve more complex question answering problems in the biomedical domain.
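
The abstract names a sliding-window augmentation and a model weighting step but gives no formulas, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the sliding window splits a long token sequence into overlapping chunks, and that model weighting is a linear interpolation of per-candidate answer scores from two models. The function names, the `weight_a` parameter, and the example scores are all hypothetical.

```python
from typing import Dict, List


def sliding_windows(tokens: List[str], size: int, stride: int) -> List[List[str]]:
    """Split a token list into overlapping windows of at most `size` tokens.

    Assumption: consecutive windows start `stride` tokens apart, and the
    last window may be shorter than `size`.
    """
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break
        start += stride
    return windows


def weighted_answer(
    scores_a: Dict[str, float],
    scores_b: Dict[str, float],
    weight_a: float,
) -> str:
    """Return the candidate answer with the highest interpolated score.

    Assumption: the combined score is
    weight_a * score_a + (1 - weight_a) * score_b,
    and a candidate missing from one model scores 0.0 for that model.
    """
    if not 0.0 <= weight_a <= 1.0:
        raise ValueError("weight_a must lie in [0, 1]")
    candidates = sorted(set(scores_a) | set(scores_b))  # sorted for determinism
    if not candidates:
        raise ValueError("no candidate answers supplied")
    combined = {
        c: weight_a * scores_a.get(c, 0.0) + (1.0 - weight_a) * scores_b.get(c, 0.0)
        for c in candidates
    }
    return max(candidates, key=lambda c: combined[c])


if __name__ == "__main__":
    # Hypothetical scores standing in for an open-domain model (a)
    # and a biomedical pre-trained model (b).
    open_domain = {"BRCA1": 0.6, "TP53": 0.4}
    biomedical = {"TP53": 0.7, "BRCA1": 0.2}
    print(sliding_windows(list("abcdefg"), size=4, stride=2))
    print(weighted_answer(open_domain, biomedical, weight_a=0.5))
```

In this sketch, `weight_a` would be tuned on held-out data; the abstract does not say how the weighting in the paper is chosen.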

Similar articles

1. Improving Biomedical Question Answering by Data Augmentation and Model Weighting.
   IEEE/ACM Trans Comput Biol Bioinform. 2023 Mar-Apr;20(2):1114-1124. doi: 10.1109/TCBB.2022.3171388. Epub 2023 Apr 3.
2. An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition.
   BMC Bioinformatics. 2015 Apr 30;16:138. doi: 10.1186/s12859-015-0564-6.
3. A Machine Learning-based Method for Question Type Classification in Biomedical Question Answering.
   Methods Inf Med. 2017 May 18;56(3):209-216. doi: 10.3414/ME16-01-0116. Epub 2017 Mar 31.
4. Deep scaled dot-product attention based domain adaptation model for biomedical question answering.
   Methods. 2020 Feb 15;173:69-74. doi: 10.1016/j.ymeth.2019.06.024. Epub 2019 Jun 26.
5. Word embeddings and external resources for answer processing in biomedical factoid question answering.
   J Biomed Inform. 2019 Apr;92:103118. doi: 10.1016/j.jbi.2019.103118. Epub 2019 Feb 10.
6. Multi-label biomedical question classification for lexical answer type prediction.
   J Biomed Inform. 2019 May;93:103143. doi: 10.1016/j.jbi.2019.103143. Epub 2019 Mar 12.
7. SemBioNLQA: A semantic biomedical question answering system for retrieving exact and ideal answers to natural language questions.
   Artif Intell Med. 2020 Jan;102:101767. doi: 10.1016/j.artmed.2019.101767. Epub 2019 Nov 28.
8. External features enriched model for biomedical question answering.
   BMC Bioinformatics. 2021 May 26;22(1):272. doi: 10.1186/s12859-021-04176-7.
9. Named Entity Aware Transfer Learning for Biomedical Factoid Question Answering.
   IEEE/ACM Trans Comput Biol Bioinform. 2022 Jul-Aug;19(4):2365-2376. doi: 10.1109/TCBB.2021.3079339. Epub 2022 Aug 8.
10. Adversarial Knowledge Distillation Based Biomedical Factoid Question Answering.
    IEEE/ACM Trans Comput Biol Bioinform. 2023 Jan-Feb;20(1):106-118. doi: 10.1109/TCBB.2022.3161032. Epub 2023 Feb 3.

Cited by

1. Rethinking Human-AI Collaboration in Complex Medical Decision Making: A Case Study in Sepsis Diagnosis.
   Proc SIGCHI Conf Hum Factor Comput Syst. 2024 May;2024. doi: 10.1145/3613904.3642343. Epub 2024 May 11.
2. Question answering systems for health professionals at the point of care-a systematic review.
   J Am Med Inform Assoc. 2024 Apr 3;31(4):1009-1024. doi: 10.1093/jamia/ocae015.