School of Computer Science, University of Manchester, Manchester M13 9PL, UK.
J Biomed Inform. 2012 Apr;45(2):199-209. doi: 10.1016/j.jbi.2011.10.002. Epub 2011 Oct 14.
A study of the use of common qualifiers in SNOMED CT definitions and the resulting classification was undertaken using combined lexical and semantic techniques. The accuracy of SNOMED authors in formulating definitions for pre-coordinated concepts was taken as a proxy for the expected accuracy of users formulating post-coordinated expressions. The study focused on "acute" and "chronic" as used within a module based on the UMLS CORE Problem List, following the pattern of SNOMED CT's definitions of Acute disease and Chronic disease. Scripts were used to identify candidate concepts whose names suggested that they should be classified as acute or chronic findings. The candidates were filtered by local clinical experts to eliminate spurious lexical matches. Scripts were then used to determine which of the filtered candidates were not classified under acute or chronic findings as expected. The results were that 28% and 20% of candidate chronic and acute concepts, respectively, were not so classified. Of these candidate misclassifications, the large majority occurred because "acute" and "chronic" are sometimes specified by qualifiers for clinical course and sometimes by qualifiers for morphology, a fact mentioned but not fully detailed in the User Guide distributed with the SNOMED releases. This heterogeneous representation reflects a potential conflict between common usage in patient care and SNOMED's origins in pathology. Other incidental findings included questions about the qualifier hierarchies themselves and issues with the underlying model for anatomy. The effort required for the study was kept modest by using module extraction and scripts, showing that such quality assurance of SNOMED is practical. The results of a preliminary study using proxy measures must be taken with caution. However, the high rate of misclassification indicates that, until the specifications for qualifiers are better documented and/or brought more in line with common clinical usage, anyone attempting to use post-coordination in SNOMED CT must be aware that there are significant pitfalls.
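The lexical-then-semantic check described above could be sketched roughly as follows. This is an illustration only, not the scripts used in the study: the hierarchy is a hand-made toy, the concept names are hypothetical stand-ins rather than real SNOMED CT content, and the helper names `ancestors` and `unclassified_candidates` are invented here. The manual filtering step by clinical experts is not modelled.

```python
import re

# Toy is-a hierarchy (child -> set of direct parents).
# Hypothetical names for illustration; not taken from a SNOMED CT release.
IS_A = {
    "Disease": set(),
    "Acute disease": {"Disease"},
    "Chronic disease": {"Disease"},
    "Bronchitis": {"Disease"},
    "Sinusitis": {"Disease"},
    "Acute bronchitis": {"Acute disease", "Bronchitis"},
    # Name says "chronic", but the concept is not placed under "Chronic disease".
    "Chronic bronchitis": {"Bronchitis"},
    "Chronic sinusitis": {"Chronic disease", "Sinusitis"},
}


def ancestors(concept, is_a):
    """Return every concept reachable from `concept` by following is-a links."""
    seen = set()
    stack = list(is_a.get(concept, ()))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(is_a.get(parent, ()))
    return seen


def unclassified_candidates(qualifier, target, is_a):
    """Lexical step: find concepts whose name contains `qualifier` as a word.
    Semantic step: report those that are not descendants of `target`."""
    pattern = re.compile(rf"\b{re.escape(qualifier)}\b", re.IGNORECASE)
    candidates = sorted(c for c in is_a if c != target and pattern.search(c))
    missing = [c for c in candidates if target not in ancestors(c, is_a)]
    return candidates, missing


if __name__ == "__main__":
    for qualifier, target in [("acute", "Acute disease"), ("chronic", "Chronic disease")]:
        candidates, missing = unclassified_candidates(qualifier, target, IS_A)
        print(f"{qualifier}: {len(missing)} of {len(candidates)} candidates not under {target!r}: {missing}")
```

In a real setting the hierarchy would come from the inferred (classified) relationships of an extracted module, and the candidate list would be reviewed by clinicians before the subsumption check, as the abstract describes.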