Hu Ran, Shi Jay, Cui Licong, Abeysinghe Rashmie
McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX.
Intermountain Healthcare, Denver, CO.
AMIA Jt Summits Transl Sci Proc. 2024 May 31;2024:545-554. eCollection 2024.
SNOMED CT is the most comprehensive clinical terminology in use worldwide, and improving its accuracy is of utmost importance. In this work, we introduce an automated approach to identifying erroneous IS-A relations in SNOMED CT. We first extract linked concept-pairs, from which we generate Term Difference Pairs (TDPs) that capture the differences between the two concepts. Given a TDP, if the reversed TDP also exists and fewer linked-pairs generate this TDP than generate the reversed TDP, we suggest the former linked-pairs as potentially erroneous IS-A relations. We applied this approach to the Clinical finding and Procedure subhierarchies of the March 2022 US Edition of SNOMED CT and obtained 52 potentially erroneous IS-A relations and a candidate list of 48 linked-pairs. A domain expert confirmed that 41 of the 52 suggestions (78.8%) are valid and identified 26 erroneous IS-A relations among the 48 linked-pairs, demonstrating the effectiveness of the approach.
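The TDP comparison rule described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions: a linked-pair is given as a (child name, parent name) tuple, a TDP is approximated as the pair (words only in the child name, words only in the parent name), and the function names `term_difference_pair` and `flag_suspicious` are hypothetical. The concept names in `toy_pairs` are invented for illustration and are not taken from SNOMED CT.

```python
from collections import defaultdict


def term_difference_pair(child_name, parent_name):
    """Return an assumed TDP: (words only in child, words only in parent)."""
    child_words = set(child_name.lower().split())
    parent_words = set(parent_name.lower().split())
    return (
        frozenset(child_words - parent_words),
        frozenset(parent_words - child_words),
    )


def flag_suspicious(linked_pairs):
    """Flag linked-pairs whose TDP is rarer than its reversed TDP."""
    # Group linked-pairs by the TDP they generate.
    groups = defaultdict(list)
    for child, parent in linked_pairs:
        groups[term_difference_pair(child, parent)].append((child, parent))

    flagged = []
    for (child_only, parent_only), pairs in groups.items():
        # The reversed TDP swaps the two difference sets.
        reversed_pairs = groups.get((parent_only, child_only))
        if reversed_pairs and len(pairs) < len(reversed_pairs):
            flagged.extend(pairs)
    return flagged


# Toy data (hypothetical names): two pairs drop "acute" going from child to
# parent, while one pair adds it, so the single reversed pair is flagged.
toy_pairs = [
    ("acute kidney disease", "kidney disease"),
    ("acute liver disease", "liver disease"),
    ("lung disease", "acute lung disease"),
]
flagged = flag_suspicious(toy_pairs)
print(flagged)
```

In this toy run only the third pair is flagged, because its TDP is generated by one linked-pair while the reversed TDP is generated by two. How linked-pairs are actually extracted and how term differences are computed in the study are not specified here, so both steps are simplified to word-set differences.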