A randomized double-blind controlled trial of automated term dissection.

Authors

Elkin P L, Bailey K R, Ogren P V, Bauer B A, Chute C G

Affiliation

Mayo Foundation, Rochester, MN, USA.

Publication

Proc AMIA Symp. 1999:62-6.

Abstract

OBJECTIVE

To compare the accuracy of an automated mechanism for term dissection, used to represent the semantic dependencies within a compositional expression, with the accuracy of a practicing internist performing the same task. We also compare the results of four evaluators to determine the inter-observer variability and the variance between term sets with respect to the accuracy of the mappings and the consistency of the failure analysis.

METHODS

Five hundred terms, each of which required a compositional expression to effect an exact match, were randomly distributed into two sets of 250 terms (Set A and Set B). Set A was dissected using the Automated Term Dissection (ATD) algorithm. Set B was dissected by a physician specializing in Internal Medicine who had no prior knowledge of the dissection algorithm or how it functioned; the authors refer to this method as Human Term Dissection (HTD). Set A was randomized into two sets of 125 terms (Set A1 and Set A2), and Set B was likewise randomized into two sets of 125 terms (Set B1 and Set B2). A new set of 250 terms, Set C, was created from Set A1 and Set B2. A second new set of 250 terms, Set D, was created from Set A2 and Set B1. Two expert Indexers reviewed Set C and another two expert Indexers reviewed Set D. They were blinded to which terms had been dissected by the clinician and which by the automated term dissection algorithm. The person providing the files for review to the Indexers was also unaware of which terms had been dissected by the ATD method and which by the HTD method. The Indexers recorded whether or not each dissection was the best possible representation of the input concept. If it was not, a failure analysis was conducted. The Indexers recorded whether the dissection was in error and, if so, whether a modifier had not been subsumed or a Kernel concept had been subsumed when it should not have been. If a concept was missing, the Indexers recorded whether it was a Kernel concept, a modifier, a qualifier or a negative qualifier.
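The allocation described above can be illustrated with a short Python sketch. It is only an illustration of the crossover design, not the authors' procedure: the function names `build_review_sets` and `blind`, the fixed seed, and the final shuffle of Sets C and D (so that item order does not reveal the method) are all assumptions introduced here.

```python
import random


def build_review_sets(terms, seed=0):
    """Split 500 terms into review sets C and D following the described design.

    The seed and the final shuffle of C and D are assumptions made for this
    sketch; the abstract does not say how randomization was implemented.
    """
    if len(terms) != 500:
        raise ValueError("the described design assumes exactly 500 terms")
    rng = random.Random(seed)
    shuffled = list(terms)
    rng.shuffle(shuffled)

    # Set A is dissected by the algorithm (ATD), Set B by the physician (HTD).
    set_a = [(term, "ATD") for term in shuffled[:250]]
    set_b = [(term, "HTD") for term in shuffled[250:]]

    # Each set is randomized again into two halves of 125 terms.
    rng.shuffle(set_a)
    rng.shuffle(set_b)
    a1, a2 = set_a[:125], set_a[125:]
    b1, b2 = set_b[:125], set_b[125:]

    # Set C = A1 + B2 and Set D = A2 + B1, so each review set mixes both methods.
    set_c = a1 + b2
    set_d = a2 + b1
    rng.shuffle(set_c)
    rng.shuffle(set_d)
    return set_c, set_d


def blind(review_set):
    """Drop the method label before the terms are handed to the Indexers."""
    return [term for term, _method in review_set]


# Placeholder term names, used only to exercise the sketch.
terms = [f"term_{i:03d}" for i in range(500)]
set_c, set_d = build_review_sets(terms)
blinded_c = blind(set_c)
```

Each review set ends up with 125 ATD and 125 HTD dissections, so every pair of Indexers judges both methods in equal numbers without being able to tell them apart.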

RESULTS

The ATD method was judged to be accurate and readable in 265 of the 424 terms with adequate content (62.7%). The HTD method was judged to be accurate in 272 of the 414 terms with adequate content (65.7%). There was no statistically significant difference between the rates of acceptability of the ATD and HTD methods (p = 0.33). There was a non-significant trend toward greater acceptability of the ATD method in the subgroup of terms with three or more compositional elements: ATD was acceptable in 53.6% of these terms, whereas HTD was acceptable in only 43.6% (p = 0.11). The failure analysis showed that both methods misrepresented kernel concepts and modifiers much more commonly than qualifiers (p < 0.001).
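As a rough check on the overall comparison, the reported counts can be run through a pooled two-proportion z-test. The abstract does not state which statistical test the authors used, so the choice of test here is an assumption, and the function name `two_proportion_z_test` is introduced only for this sketch.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for the difference between two proportions.

    This is a standard textbook test, assumed here for illustration; it is
    not necessarily the test used in the study.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1, p2, z, p_value


# Counts taken from the RESULTS paragraph: ATD 265/424, HTD 272/414.
atd_rate, htd_rate, z, p_value = two_proportion_z_test(265, 424, 272, 414)
print(f"ATD {atd_rate:.1%}, HTD {htd_rate:.1%}, z = {z:.2f}, p = {p_value:.2f}")
```

Under this assumed test the p-value rounds to 0.33, in line with the reported value. The ratio 265/424 evaluates to 62.5%, slightly below the 62.7% printed in the abstract; the abstract gives no explanation for the difference.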

CONCLUSIONS

There is no statistically significant difference in the accuracy and readability of terms dissected using the automated term dissection method when compared with human term dissection, as judged by four expert medical indexers. There is a non-significant trend toward improved performance of the ATD method in the subset of more complex terms. The authors submit that this may be due to a tendency for users to be less compulsive when the time to complete the task is long. Automated term dissection is a useful and perhaps preferable method for representing readable and accurate compound terminological expressions.
