

List-wise learning to rank biomedical question-answer pairs with deep ranking recursive autoencoders.

Affiliations

Department of Computer Science and Technology, School of Mechanical Electronic and Information Engineering, China University of Mining and Technology Beijing, Beijing, China.

Alibaba Group, Hangzhou, China.

Publication information

PLoS One. 2020 Nov 9;15(11):e0242061. doi: 10.1371/journal.pone.0242061. eCollection 2020.

Abstract

Biomedical question answering (QA) is of growing interest in industry and academia because of the crucial impact of biomedical information. When mapping and ranking candidate snippet answers within relevant literature, current QA systems typically rely on information retrieval (IR) techniques, specifically query processing approaches and ranking models. However, these IR-based approaches do not adequately account for both syntactic and semantic relatedness and thus cannot formulate accurate natural language answers. Recently, deep learning approaches have become well known for learning optimal semantic feature representations in natural language processing tasks. In this paper, we present a deep ranking recursive autoencoders (rankingRAE) architecture that ranks question-candidate snippet answer (Q-S) pairs to obtain, from potentially relevant documents, the candidate answers most relevant to a biomedical question. In particular, we convert the task of ranking candidate answers into several simultaneous binary classification tasks, each determining whether a question and a candidate answer are relevant. The constituent words of the concatenated Q-S pairs, together with their randomly initialized vectors, are fed into recursive autoencoders to learn the optimal semantic representations in an unsupervised way, and their semantic relatedness is then classified through supervised learning. Unlike several existing methods that directly choose the top-K candidates with the highest probabilities, we take the influence of different ranking results into consideration. Consequently, we define a listwise "ranking error" for the loss function computation to penalize inappropriate answer rankings for each question and to eliminate their influence. The proposed architecture is evaluated on six years of BioASQ biomedical question answering benchmarks (2013-2018). Compared with classical IR models, other deep representation models, and some state-of-the-art systems for these tasks, the experimental results demonstrate the robustness and effectiveness of rankingRAE.
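
The abstract does not give the formula for the listwise "ranking error", so the following is only a minimal sketch of the general idea, not the authors' definition. It assumes each question comes with a list of candidate snippets, a predicted relevance probability per Q-S pair, and a binary relevance label. The names `bce`, `ranking_error`, `listwise_loss`, `top_k`, and the weight `alpha` are hypothetical, and the pairwise-misordering penalty is a stand-in chosen for illustration.

```python
import math


def bce(prob, label):
    """Binary cross-entropy for one question-snippet (Q-S) pair."""
    eps = 1e-12
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def ranking_error(probs, labels):
    """Fraction of (relevant, irrelevant) pairs scored in the wrong order.

    Assumption: an illustrative list-level penalty, not the paper's formula.
    """
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    if not pos or not neg:
        return 0.0  # nothing can be misordered
    wrong = sum(1 for p in pos for n in neg if p <= n)
    return wrong / (len(pos) * len(neg))


def listwise_loss(probs, labels, alpha=1.0):
    """Mean per-pair BCE plus alpha times the list-level ranking error."""
    mean_bce = sum(bce(p, y) for p, y in zip(probs, labels)) / len(probs)
    return mean_bce + alpha * ranking_error(probs, labels)


def top_k(snippets, probs, k):
    """Return the k snippets with the highest predicted relevance."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [snippets[i] for i in order[:k]]


# One question with four candidate snippets (made-up numbers).
probs = [0.9, 0.2, 0.6, 0.4]
labels = [1, 0, 0, 1]
print(ranking_error(probs, labels))               # 0.25
print(top_k(["s1", "s2", "s3", "s4"], probs, 2))  # ['s1', 's3']
```

In this toy list, the relevant snippet scored 0.4 falls below the irrelevant one scored 0.6, so one of the four relevant-irrelevant pairs is misordered. Plain top-K selection would return `s3` without penalty, while the list-level term raises the loss for that question. A counting penalty like this one is not differentiable, so a trainable model would need a smooth surrogate.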


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9907/7652278/ca95ffa95609/pone.0242061.g001.jpg
