


A question-entailment approach to question answering.

Affiliations

Lister Hill Center, U.S. National Library of Medicine, U.S. National Institutes of Health, Bethesda, MD, USA.

Publication

BMC Bioinformatics. 2019 Oct 22;20(1):511. doi: 10.1186/s12859-019-3119-4.

DOI: 10.1186/s12859-019-3119-4
PMID: 31640539
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6805558/
Abstract

BACKGROUND

One of the challenges in large-scale information retrieval (IR) is developing fine-grained and domain-specific methods to answer natural language questions. Despite the availability of numerous sources and datasets for answer retrieval, Question Answering (QA) remains a challenging problem due to the difficulty of the question understanding and answer extraction tasks. One of the promising tracks investigated in QA is mapping new questions to formerly answered questions that are "similar".
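The idea of mapping a new question onto formerly answered "similar" questions can be illustrated with a minimal lexical nearest-neighbour lookup. This is a simplified sketch, not the paper's method; the FAQ pairs below are invented examples, not MedQuAD entries.

```python
# Minimal sketch: answer a new question by retrieving the most similar
# previously answered question, using token-overlap (Jaccard) similarity.

def tokens(text):
    # Lowercase and split into a set of words, ignoring question marks.
    return set(text.lower().replace("?", "").split())

def jaccard(a, b):
    # Jaccard similarity between the token sets of two questions.
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

# Invented question-answer pairs standing in for a curated QA collection.
FAQ = [
    ("What are the symptoms of influenza?", "Fever, cough, and fatigue."),
    ("How is type 2 diabetes treated?", "Lifestyle changes and metformin."),
]

def answer(new_question):
    # Return the stored answer of the best-matching answered question.
    best_q, best_a = max(FAQ, key=lambda qa: jaccard(new_question, qa[0]))
    return best_a

print(answer("What symptoms does influenza cause?"))
```

In practice, lexical overlap alone is a weak similarity signal; the paper's contribution is precisely to replace it with learned entailment models.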

RESULTS

We propose a novel QA approach based on Recognizing Question Entailment (RQE) and we describe the QA system and resources that we built and evaluated on real medical questions. First, we compare logistic regression and deep learning methods for RQE using different kinds of datasets including textual inference, question similarity, and entailment in both the open and clinical domains. Second, we combine IR models with the best RQE method to select entailed questions and rank the retrieved answers. To study the end-to-end QA approach, we built the MedQuAD collection of 47,457 question-answer pairs from trusted medical sources which we introduce and share in the scope of this paper. Following the evaluation process used in TREC 2017 LiveQA, we find that our approach exceeds the best results of the medical task with a 29.8% increase over the best official score.
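The IR-plus-RQE combination described above can be sketched as a two-stage pipeline: an IR model proposes candidate answered questions, an entailment check filters them, and the survivors' answers are returned in ranked order. Both scoring functions here are simplified stand-ins, not the paper's actual IR models or RQE classifiers.

```python
# Two-stage sketch of an IR + RQE pipeline.

def ir_score(query, question):
    # Stand-in IR model: fraction of query tokens found in the candidate question.
    q = query.lower().split()
    doc = set(question.lower().split())
    return sum(t in doc for t in q) / len(q)

def rqe_entails(new_q, answered_q, threshold=0.5):
    # Stand-in RQE classifier: treat high lexical overlap as entailment.
    # The paper trains logistic regression and deep learning models instead.
    return ir_score(new_q, answered_q) >= threshold

def answer_pipeline(query, qa_pairs, top_k=3):
    # Stage 1: IR retrieval of the top-k candidate answered questions.
    candidates = sorted(qa_pairs, key=lambda qa: ir_score(query, qa[0]),
                        reverse=True)[:top_k]
    # Stage 2: keep only candidates whose question is entailed; order follows
    # the IR ranking.
    return [a for q, a in candidates if rqe_entails(query, q)]

# Invented pairs standing in for a trusted QA collection such as MedQuAD.
pairs = [
    ("what causes high blood pressure", "Common causes include ..."),
    ("what causes migraine headaches", "Triggers include stress ..."),
    ("how to treat a sprained ankle", "Rest, ice, compression ..."),
]
print(answer_pipeline("what causes blood pressure spikes", pairs))
```

The design point the abstract makes is that the entailment stage prunes superficially similar but non-entailed questions that a pure IR ranking would pass through.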

CONCLUSIONS

The evaluation results support the relevance of question entailment for QA and highlight the effectiveness of combining IR and RQE for future QA efforts. Our findings also show that relying on a restricted set of reliable answer sources can bring a substantial improvement in medical QA.


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a08/6805558/8292b21c6bef/12859_2019_3119_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a08/6805558/564dd270d246/12859_2019_3119_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a08/6805558/0bdcbef2149b/12859_2019_3119_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a08/6805558/2f1ced6eb7a6/12859_2019_3119_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a08/6805558/9aca9233c029/12859_2019_3119_Fig5_HTML.jpg

Similar articles

1
A question-entailment approach to question answering.
BMC Bioinformatics. 2019 Oct 22;20(1):511. doi: 10.1186/s12859-019-3119-4.
2
Epidemic Question Answering: question generation and entailment for Answer Nugget discovery.
J Am Med Inform Assoc. 2023 Jan 18;30(2):329-339. doi: 10.1093/jamia/ocac222.
3
SemBioNLQA: A semantic biomedical question answering system for retrieving exact and ideal answers to natural language questions.
Artif Intell Med. 2020 Jan;102:101767. doi: 10.1016/j.artmed.2019.101767. Epub 2019 Nov 28.
4
Recognizing Question Entailment for Medical Question Answering.
AMIA Annu Symp Proc. 2017 Feb 10;2016:310-318. eCollection 2016.
5
Consumer health information and question answering: helping consumers find answers to their health-related information needs.
J Am Med Inform Assoc. 2020 Feb 1;27(2):194-201. doi: 10.1093/jamia/ocz152.
6
Word embeddings and external resources for answer processing in biomedical factoid question answering.
J Biomed Inform. 2019 Apr;92:103118. doi: 10.1016/j.jbi.2019.103118. Epub 2019 Feb 10.
7
On the Role of Question Summarization and Information Source Restriction in Consumer Health Question Answering.
AMIA Jt Summits Transl Sci Proc. 2019 May 6;2019:117-126. eCollection 2019.
8
A passage retrieval method based on probabilistic information retrieval model and UMLS concepts in biomedical question answering.
J Biomed Inform. 2017 Apr;68:96-103. doi: 10.1016/j.jbi.2017.03.001. Epub 2017 Mar 7.
9
Identifying the Question Similarity of Regulatory Documents in the Pharmaceutical Industry by Using the Recognizing Question Entailment System: Evaluation Study.
JMIR AI. 2023 Sep 26;2:e43483. doi: 10.2196/43483.
10
A Semi-Supervised Learning Approach to Enhance Health Care Community-Based Question Answering: A Case Study in Alcoholism.
JMIR Med Inform. 2016 Aug 2;4(3):e24. doi: 10.2196/medinform.5490.

Cited by

1
Medical LLMs: Fine-Tuning vs. Retrieval-Augmented Generation.
Bioengineering (Basel). 2025 Jun 24;12(7):687. doi: 10.3390/bioengineering12070687.
2
EYE-Llama, an in-domain large language model for ophthalmology.
iScience. 2025 Jun 23;28(7):112984. doi: 10.1016/j.isci.2025.112984. eCollection 2025 Jul 18.
3
Large language model trained on clinical oncology data predicts cancer progression.

References

1
Consumer health information and question answering: helping consumers find answers to their health-related information needs.
J Am Med Inform Assoc. 2020 Feb 1;27(2):194-201. doi: 10.1093/jamia/ocz152.
2
Semantic annotation of consumer health questions.
BMC Bioinformatics. 2018 Feb 6;19(1):34. doi: 10.1186/s12859-018-2045-1.
3
Expert Search Strategies: The Information Retrieval Practices of Healthcare Information Professionals.
NPJ Digit Med. 2025 Jul 2;8(1):397. doi: 10.1038/s41746-025-01780-2.
4
The Venus score for the assessment of the quality and trustworthiness of biomedical datasets.
BioData Min. 2025 Jan 9;18(1):1. doi: 10.1186/s13040-024-00412-x.
5
Large language models in health care: Development, applications, and challenges.
Health Care Sci. 2023 Jul 24;2(4):255-263. doi: 10.1002/hcs2.61. eCollection 2023 Aug.
6
Identifying the Question Similarity of Regulatory Documents in the Pharmaceutical Industry by Using the Recognizing Question Entailment System: Evaluation Study.
JMIR AI. 2023 Sep 26;2:e43483. doi: 10.2196/43483.
7
EYE-Llama, an in-domain large language model for ophthalmology.
bioRxiv. 2024 Apr 29:2024.04.26.591355. doi: 10.1101/2024.04.26.591355.
8
Question answering systems for health professionals at the point of care-a systematic review.
J Am Med Inform Assoc. 2024 Apr 3;31(4):1009-1024. doi: 10.1093/jamia/ocae015.
9
Automatic extraction of social determinants of health from medical notes of chronic lower back pain patients.
J Am Med Inform Assoc. 2023 Jul 19;30(8):1438-1447. doi: 10.1093/jamia/ocad054.
10
Classification of neurologic outcomes from medical notes using natural language processing.
Expert Syst Appl. 2023 Mar 15;214. doi: 10.1016/j.eswa.2022.119171. Epub 2022 Nov 6.
JMIR Med Inform. 2017 Oct 2;5(4):e33. doi: 10.2196/medinform.7680.
11
Combining Open-domain and Biomedical Knowledge for Topic Recognition in Consumer Health Questions.
AMIA Annu Symp Proc. 2017 Feb 10;2016:914-923. eCollection 2016.
12
Recognizing Question Entailment for Medical Question Answering.
AMIA Annu Symp Proc. 2017 Feb 10;2016:310-318. eCollection 2016.
13
MetaMap Lite: an evaluation of a new Java implementation of MetaMap.
J Am Med Inform Assoc. 2017 Jul 1;24(4):841-844. doi: 10.1093/jamia/ocw177.
14
Interactive use of online health resources: a comparison of consumer and professional questions.
J Am Med Inform Assoc. 2016 Jul;23(4):802-11. doi: 10.1093/jamia/ocw024. Epub 2016 May 4.
15
SimQ: real-time retrieval of similar consumer health questions.
J Med Internet Res. 2015 Feb 17;17(2):e43. doi: 10.2196/jmir.3388.
16
Biomedical question answering: a survey.
Comput Methods Programs Biomed. 2010 Jul;99(1):1-24. doi: 10.1016/j.cmpb.2009.10.003. Epub 2009 Nov 13.
17
A taxonomy of generic clinical questions: classification study.
BMJ. 2000 Aug 12;321(7258):429-32. doi: 10.1136/bmj.321.7258.429.