Toward automated classification of consumers' cancer-related questions with a new taxonomy of expected answer types.

Authors

McRoy Susan, Jones Sean, Kurmally Adam

Affiliation

University of Wisconsin-Milwaukee, USA

Publication

Health Informatics J. 2016 Sep;22(3):523-35. doi: 10.1177/1460458215571643. Epub 2015 Mar 10.

Abstract

This article examines methods for automated question classification applied to cancer-related questions that people have asked on the web. This work is part of a broader effort to provide automated question answering for health education. We created a new corpus of consumer-health questions related to cancer and a new taxonomy for those questions. We then compared the effectiveness of different statistical methods for developing classifiers, including weighted classification and resampling. Basic methods for building classifiers were limited by the high variability in the natural distribution of questions, and the typical refinement approaches of feature selection and merging categories achieved only small improvements in classifier accuracy. The best performance was achieved using weighted classification and resampling methods, the latter yielding an F1 score of 0.963. Thus, it would appear that statistical classifiers can be trained on natural data, but only if the natural distributions of classes are smoothed. Such classifiers would be useful for automated question answering, for enriching web-based content, or for assisting clinical professionals in answering questions.
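
The abstract does not say how weighted classification or resampling was implemented, so the following is only a minimal sketch of one common form of each: inverse-frequency class weights and random oversampling of minority classes. The function names, the toy questions, and the labels "definition", "treatment", and "cause" are hypothetical illustrations and are not taken from the paper's corpus or taxonomy.

```python
import random
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count),
    so rare classes count more during training."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * m) for c, m in counts.items()}


def oversample(examples, labels, seed=0):
    """Randomly duplicate minority-class examples until every class
    has as many examples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(examples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_x, out_y = [], []
    for y, xs in by_class.items():
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_x.append(x)
            out_y.append(y)
    return out_x, out_y


# Hypothetical toy data with a skewed label distribution (assumption).
questions = [
    "What is melanoma?",
    "What is a biopsy?",
    "What is leukemia?",
    "What is staging?",
    "How is breast cancer treated?",
    "Can smoking cause lung cancer?",
]
labels = ["definition"] * 4 + ["treatment", "cause"]

weights = inverse_frequency_weights(labels)
balanced_questions, balanced_labels = oversample(questions, labels)
print(weights)
print(Counter(balanced_labels))
```

Either output can then be passed to a classifier of choice: the weights as per-class costs, or the balanced lists as training data. Oversampling should be applied to the training split only, since duplicating examples before a train/test split leaks copies of test items into training.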
