New York Institute of Technology, New York, NY, USA.
J Biomed Inform. 2012 Dec;45(6):1042-8. doi: 10.1016/j.jbi.2012.05.006. Epub 2012 Jun 9.
Auditing healthcare terminologies for errors requires human experts. In this paper, we present a study of the performance of auditors looking for errors in the semantic type assignments of complex UMLS concepts. In this study, a concept is considered complex if it is assigned a combination of semantic types. Past research has shown that complex concepts have a higher likelihood of errors. The results of this study indicate that individual auditors are not reliable when auditing such concepts and that their performance is low according to various metrics. These results confirm the outcomes of an earlier pilot study. They imply that, to achieve an acceptable level of reliability and performance when auditing such UMLS concepts, several auditors need to be assigned the same task. A mechanism is then needed to combine the possibly differing opinions of the auditors into a final determination. In the current study, in contrast to our previous work, we used a majority mechanism for this purpose. For a sample of 232 complex UMLS concepts, the majority opinion was found to be reliable, and its accuracy, recall, precision and F-measure were statistically significantly higher than the average performance of individual auditors.
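The majority mechanism and the four reported metrics can be illustrated with a minimal sketch. It is not the study's implementation. The function names, the tie rule (a tie counts as "no error") and the toy votes are all assumptions made for illustration, and none of the numbers come from the study. Each auditor's opinion on a concept is modeled as a boolean (True means "this semantic type assignment is an error"), and the majority opinion is scored against a hypothetical reference standard.

```python
def majority_vote(votes):
    """Combine auditors' boolean opinions on one concept.

    Returns True (error) only if strictly more than half of the
    auditors flagged an error. Treating a tie as "no error" is an
    assumption of this sketch, not a rule taken from the study.
    """
    return sum(votes) * 2 > len(votes)


def binary_metrics(predicted, reference):
    """Accuracy, precision, recall and F-measure for boolean labels."""
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)
    fp = sum(1 for p, r in pairs if p and not r)
    fn = sum(1 for p, r in pairs if not p and r)
    tn = sum(1 for p, r in pairs if not p and not r)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical toy data: 5 concepts, 3 auditors each (not from the study).
auditor_votes = [
    [True, True, False],
    [False, False, True],
    [True, False, True],
    [False, False, False],
    [False, True, True],
]
reference_standard = [True, False, True, False, False]

majority_opinion = [majority_vote(v) for v in auditor_votes]
scores = binary_metrics(majority_opinion, reference_standard)
print(majority_opinion)
print(scores)
```

Comparing the majority opinion with individual auditors, as the abstract describes, would amount to calling `binary_metrics` once per auditor column and averaging the results. The statistical significance testing is outside the scope of this sketch.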