Lister Hill National Center for Biomedical Communications, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA.
J Am Med Inform Assoc. 2020 Feb 1;27(2):194-201. doi: 10.1093/jamia/ocz152.
Consumers increasingly turn to the internet in search of health-related information, and they want their questions answered with short and precise passages rather than having to analyze lists of relevant documents returned by search engines and read each document to find an answer. We aim to answer consumer health questions with information from reliable sources.
We combine knowledge-based, traditional machine learning, and deep learning approaches to understand consumers' questions and select the best answers from consumer-oriented sources. We evaluate the end-to-end system and its components on simple questions generated in a pilot development of the MedlinePlus Alexa skill, as well as on short and long real-life questions submitted to the National Library of Medicine by consumers.
Our system achieves 78.7% mean average precision and 87.9% mean reciprocal rank on simple Alexa questions, and 44.5% mean average precision and 51.6% mean reciprocal rank on real-life questions submitted by National Library of Medicine consumers.
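For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how mean average precision and mean reciprocal rank are commonly computed from ranked answer lists. It is not the authors' evaluation code. The function names and the boolean relevance-list input format are assumptions made for illustration, and average precision is normalized here by the number of relevant answers that appear in the ranked list, which may differ from the normalization used in the paper.

```python
from typing import Sequence


def average_precision(relevance: Sequence[bool]) -> float:
    """Average precision for one question.

    `relevance[i]` is True when the answer at rank i+1 is judged relevant.
    Assumption: normalizes by the number of relevant answers in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def reciprocal_rank(relevance: Sequence[bool]) -> float:
    """1 / rank of the first relevant answer, or 0.0 if there is none."""
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            return 1.0 / rank
    return 0.0


def mean_average_precision(runs: Sequence[Sequence[bool]]) -> float:
    """Mean of per-question average precision over all questions."""
    return sum(average_precision(r) for r in runs) / len(runs)


def mean_reciprocal_rank(runs: Sequence[Sequence[bool]]) -> float:
    """Mean of per-question reciprocal rank over all questions."""
    return sum(reciprocal_rank(r) for r in runs) / len(runs)


# Hypothetical relevance judgments for three questions (not data from the study).
example_runs = [
    [True, False, True],   # relevant answers at ranks 1 and 3
    [False, True],         # first relevant answer at rank 2
    [False, False],        # no relevant answer retrieved
]
example_map = mean_average_precision(example_runs)
example_mrr = mean_reciprocal_rank(example_runs)
```

Mean reciprocal rank rewards only the position of the first correct answer, while mean average precision also credits additional correct answers further down the list, which is why the two figures reported for the same question set differ.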
The ensemble of deep learning, domain knowledge, and traditional approaches recognizes question type and focus well in the simple questions, but it leaves room for improvement on the real-life consumers' questions. Information retrieval approaches alone are sufficient for finding answers to simple Alexa questions. Answering real-life questions, however, benefits from a combination of information retrieval and inference approaches.
This pilot implementation of the research needed to help consumers find reliable answers to their health-related questions demonstrates that, for most questions, reliable answers exist and can be found automatically with acceptable accuracy.