Fast exact string pattern-matching algorithms adapted to the characteristics of the medical language.

Author information

Lovis C, Baud R H

Affiliation

Puget Sound Health Care System, Seattle, Washington, USA.

Publication information

J Am Med Inform Assoc. 2000 Jul-Aug;7(4):378-91. doi: 10.1136/jamia.2000.0070378.

DOI:10.1136/jamia.2000.0070378

PMID:10887166

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC61442/

Abstract

OBJECTIVE

The authors consider the problem of exact string pattern matching using algorithms that do not require any preprocessing. To choose the most appropriate algorithm, distinctive features of the medical language must be taken into account. The characteristics of medical language are emphasized in this regard, the best algorithm of those reviewed is proposed, and detailed evaluations of time complexity for processing medical texts are provided.
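As a point of reference for the problem stated above, exact matching means reporting every position at which a pattern occurs verbatim in a text. The sketch below is a plain brute-force baseline, not code from the paper; the function name `naive_search` and the sample strings are assumptions made for illustration only.

```python
def naive_search(text, pattern):
    """Return the start index of every exact occurrence of pattern in text.

    Illustrative brute-force baseline (hypothetical helper, not from the paper):
    it tries every alignment and compares character by character.
    """
    n, m = len(text), len(pattern)
    if m == 0:
        return []  # treat the empty pattern as having no occurrences
    matches = []
    for pos in range(n - m + 1):
        j = 0
        # Advance while the characters at this alignment agree.
        while j < m and text[pos + j] == pattern[j]:
            j += 1
        if j == m:  # the whole pattern matched at this alignment
            matches.append(pos)
    return matches


print(naive_search("hepatitis", "ti"))  # [4, 6]
```

Neither the text nor the pattern is preprocessed here, at the cost of up to len(text) * len(pattern) character comparisons in the worst case.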

DESIGN

The authors first illustrate and discuss the techniques of various string pattern-matching algorithms. Next, the source code and the behavior of representative exact string pattern-matching algorithms are presented in a comprehensive manner to promote their implementation. Detailed explanations of the use of various techniques to improve performance are given.

MEASUREMENTS

Real-time measures of time complexity with English medical texts are presented. They lead to results distinct from those found in the computer science literature, which are typically computed with normally distributed texts.

RESULTS

The Boyer-Moore-Horspool algorithm achieves the best overall results when used with medical texts. This algorithm usually performs at least twice as fast as the other algorithms tested.
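For readers unfamiliar with the algorithm named above, the following is a minimal sketch of Boyer-Moore-Horspool as it is commonly described. It is not the authors' source code, and the name `horspool_search` and the sample sentence are assumptions. Only the pattern is used to build a small shift table; the text itself is left untouched.

```python
def horspool_search(text, pattern):
    """Return the start index of every exact occurrence of pattern in text.

    Illustrative Boyer-Moore-Horspool sketch (hypothetical helper, not the
    paper's implementation).
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Shift table: for each character in the pattern except its last position,
    # the distance from its rightmost occurrence to the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    matches = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            matches.append(pos)
        # Shift according to the text character aligned with the pattern's
        # last position; characters absent from the table allow a full shift.
        pos += shift.get(text[pos + m - 1], m)
    return matches


note = "chronic renal failure and acute renal failure"
print(horspool_search(note, "renal"))  # [8, 32]
```

The speedup comes from the shift step: whenever the text character under the pattern's last position does not occur elsewhere in the pattern, the window jumps forward by the full pattern length, so many text characters are never inspected.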

CONCLUSION

The time performance of exact string pattern matching can be greatly improved if an efficient algorithm is used. Considering the growing amount of text handled in the electronic patient record, it is worth implementing this efficient algorithm.
