Rance Bastien, Snyder Michelle, Lewis Janine, Bodenreider Olivier
U.S. National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA.
Stud Health Technol Inform. 2013;192:529-33.
Rare disease information sources are incompletely and inconsistently cross-referenced to one another, making it difficult for information seekers to navigate across them. Establishing such cross-references manually requires expert effort and is generally labor-intensive and costly.
Our objective is to develop an automatic mapping between two of the major rare disease information sources, GARD and Orphanet, by leveraging terminological resources, especially the UMLS.
We map the rare disease terms from Orphanet and from ORDR (the NIH Office of Rare Diseases Research, the source of the GARD terms) to the UMLS. We then use the UMLS as a pivot to bridge between the two rare disease terminologies. Finally, we compare our results to a mapping obtained through manually established cross-references to OMIM.
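The pivot step can be illustrated with a minimal sketch, assuming each source term has already been mapped to one or more UMLS concept identifiers. The function name `pivot_mapping`, the dictionary layout, and all identifiers below are hypothetical and chosen for illustration only; they are not taken from the study.

```python
from collections import defaultdict


def pivot_mapping(source_a_to_cui, source_b_to_cui):
    """Pair terms from two sources that share at least one pivot concept.

    Each argument maps a source term identifier to the set of pivot
    concept identifiers (here standing in for UMLS concepts) it maps to.
    """
    # Invert source B so we can look up its terms by pivot concept.
    cui_to_b = defaultdict(set)
    for b_id, cuis in source_b_to_cui.items():
        for cui in cuis:
            cui_to_b[cui].add(b_id)

    # A term in A is paired with every term in B reachable through a shared concept.
    pairs = set()
    for a_id, cuis in source_a_to_cui.items():
        for cui in cuis:
            for b_id in cui_to_b.get(cui, ()):
                pairs.add((a_id, b_id))
    return pairs


# Hypothetical identifiers, for illustration only.
gard_to_cui = {
    "GARD:1": {"C0000001"},
    "GARD:2": {"C0000002"},
    "GARD:3": {"C0000009"},  # no counterpart in the other source
}
orphanet_to_cui = {
    "ORPHA:10": {"C0000001"},
    "ORPHA:20": {"C0000002", "C0000003"},
}

mapping = pivot_mapping(gard_to_cui, orphanet_to_cui)
```

In this toy example, `GARD:3` stays unmapped because no Orphanet term shares its concept, which is the kind of gap that lowers recall.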
Our mapping has a precision of 94%, a recall of 63% and an F1-score of 76%. Our automatic mapping should facilitate the development of more complete and consistent cross-references between GARD and Orphanet, and the approach is applicable to other rare disease information sources as well.
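As a reminder of how these three measures relate, the following sketch computes them for a set of predicted term pairs against a reference set, assuming the manually derived mapping plays the role of the reference. The function name `evaluate` and the toy pairs are hypothetical and do not reproduce the study's data.

```python
def evaluate(predicted, reference):
    """Return (precision, recall, F1) for predicted pairs against reference pairs."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy pairs, for illustration only.
predicted = {("G1", "O1"), ("G2", "O2"), ("G3", "O9")}
reference = {("G1", "O1"), ("G2", "O2"), ("G4", "O4"), ("G5", "O5")}

precision, recall, f1 = evaluate(predicted, reference)
```

Because F1 is a harmonic mean, it is pulled toward the lower of the two values, which is why a high precision paired with a moderate recall yields an intermediate score.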