Zhu Xian, Chen Yuanyuan, Gu Yueming, Xiao Zhifeng
School of Information Management, Nanjing University, Nanjing, China.
School of Health Economics and Management, Nanjing University of Chinese Medicine, Nanjing, China.
Front Neurorobot. 2022 Mar 10;16:773329. doi: 10.3389/fnbot.2022.773329. eCollection 2022.
Transfer learning is increasingly applied across a broad spectrum of natural language processing (NLP) tasks, including question answering (QA). It allows a model to inherit domain knowledge from an existing model that has been sufficiently pre-trained. In the biomedical field, most QA datasets are limited by insufficient training examples and the presence of factoid questions. This study proposes a transfer learning-based, sentiment-aware model named SentiMedQAer for biomedical QA. The proposed method consists of a learning pipeline that uses BioBERT to encode text tokens with contextual and domain-specific embeddings, fine-tunes Text-to-Text Transfer Transformer (T5) and RoBERTa models to integrate sentiment information into the model, and trains an XGBoost classifier to output a confidence score that determines the final answer to the question. We validate SentiMedQAer on PubMedQA, a biomedical QA dataset with reasoning-required yes/no questions. Results show that our method outperforms the state of the art (SOTA) by 15.83% and a single human annotator by 5.91%.
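The data flow of the pipeline described in the abstract (encode, add sentiment features, score, decide) can be illustrated with a minimal sketch. This is not the authors' implementation: every name below (`QAExample`, `encode_text`, `sentiment_features`, `confidence_score`, `answer`) is hypothetical, the encoder and the sentiment step are toy stand-ins for BioBERT and the fine-tuned T5/RoBERTa models, and a hand-set logistic function replaces the trained XGBoost classifier. The 0.5 threshold is also an assumption.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class QAExample:
    """A yes/no question paired with its supporting context (hypothetical type)."""
    question: str
    context: str


# Toy lexicons; assumptions for illustration only, not taken from the paper.
POSITIVE = {"effective", "improved", "beneficial"}
NEGATIVE = {"failed", "ineffective", "worsened"}


def encode_text(example: QAExample) -> List[float]:
    """Stand-in for a BioBERT encoder: returns two crude length features."""
    tokens = f"{example.question} {example.context}".split()
    mean_len = sum(len(t) for t in tokens) / max(len(tokens), 1)
    return [float(len(tokens)), mean_len]


def sentiment_features(example: QAExample) -> List[float]:
    """Stand-in for fine-tuned T5/RoBERTa sentiment models: lexicon counts."""
    tokens = [t.strip(".,?").lower() for t in example.context.split()]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    return [float(pos), float(neg)]


def confidence_score(features: List[float]) -> float:
    """Stand-in for the XGBoost classifier: a fixed logistic over sentiment counts."""
    _, _, pos, neg = features
    return 1.0 / (1.0 + math.exp(-(pos - neg)))


def answer(example: QAExample, threshold: float = 0.5) -> str:
    """Concatenate both feature groups, score them, and threshold into yes/no."""
    features = encode_text(example) + sentiment_features(example)
    return "yes" if confidence_score(features) >= threshold else "no"


if __name__ == "__main__":
    ex = QAExample(
        question="Does the treatment help?",
        context="The treatment was effective and improved outcomes.",
    )
    print(answer(ex))
```

In a real system, each stand-in would be replaced by the corresponding pre-trained or fine-tuned model, and the classifier would be trained on labeled PubMedQA examples instead of using fixed weights.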