A frequency-based technique to improve the spelling suggestion rank in medical queries.

Authors

Crowell Jonathan, Zeng Qing, Ngo Long, Lacroix Eve-Marie

Affiliation

Decision Systems Group, Brigham & Women's Hospital, Harvard Medical School, Boston, MA 02115, USA.

Publication information

J Am Med Inform Assoc. 2004 May-Jun;11(3):179-85. doi: 10.1197/jamia.M1474. Epub 2004 Feb 5.

Abstract

OBJECTIVE

There is an abundance of health-related information online, and millions of consumers search for such information. Spell checking is of crucial importance in returning pertinent results, so the authors propose a technique for increasing the effectiveness of spell-checking tools used for health-related information retrieval.

DESIGN

A sample of incorrectly spelled medical terms was submitted to two different spell-checking tools, and the resulting suggestions, derived under two different dictionary configurations, were re-sorted according to how frequently each term appeared in log data from a medical search engine.
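The re-sorting step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `resort_by_frequency`, the sample query log, and the suggestion list are all hypothetical, and the abstract does not say how ties or terms missing from the log were handled. Here, suggestions that never appear in the log get a count of zero, and ties keep the spell checker's original order.

```python
from collections import Counter


def resort_by_frequency(suggestions, term_counts):
    """Re-sort suggestions so that terms seen more often in the log come first.

    Python's sort is stable, so suggestions with equal counts keep the
    order the spell checker originally returned them in.
    """
    return sorted(suggestions, key=lambda term: -term_counts.get(term.lower(), 0))


# Hypothetical query-log data standing in for the medical search engine log.
query_log = ["diabetes", "diabetes", "diabetic", "diabetes", "dialysis", "dialysis"]
term_counts = Counter(query_log)

# Hypothetical spell checker output for a misspelling such as "diabetis".
original_suggestions = ["diabase", "dialysis", "diabetes"]

resorted = resort_by_frequency(original_suggestions, term_counts)
print(resorted)  # ['diabetes', 'dialysis', 'diabase']
```

In this toy case the term that is most common in the log moves from rank three to rank one, which is the kind of change in first-rank success that the study measures.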

MEASUREMENTS

Univariable analysis was carried out to assess the effect of each factor (spell-checking tool, dictionary type, re-sort, or no re-sort) on the probability of success. The factors that were statistically significant in the univariable analysis were then used in multivariable analysis to evaluate the independent effect of each of the factors.

RESULTS

The re-sorted suggestions proved to be significantly more accurate than the original list returned by the spell-checking tool. The odds of finding the correct suggestion in the number one rank were increased by 63% after re-sorting using the authors' method. This effect was independent of both the dictionary and the spell-checking tools that were used.
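A 63% increase in odds corresponds to an odds ratio of about 1.63, which is not the same as a 63-percentage-point gain in probability. The sketch below shows the conversion under an assumed baseline: the abstract does not report the baseline probability of a first-rank hit, so the value 0.5 and the function name `apply_odds_ratio` are purely illustrative.

```python
def apply_odds_ratio(baseline_probability, odds_ratio):
    """Return the probability obtained by scaling the baseline odds by odds_ratio."""
    if not 0.0 <= baseline_probability < 1.0:
        raise ValueError("baseline_probability must be in [0, 1)")
    baseline_odds = baseline_probability / (1.0 - baseline_probability)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


# Assumed baseline of 0.5 (hypothetical; not reported in the abstract).
new_probability = apply_odds_ratio(0.5, 1.63)
print(round(new_probability, 2))  # 0.62
```

With this assumed baseline, the probability of a first-rank hit would rise from 0.50 to roughly 0.62. A different baseline gives a different absolute change for the same odds ratio.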

CONCLUSION

Using knowledge about the frequency of a given word's occurrence in the medical domain can significantly improve spelling correction for medical queries.

