Hassanzadeh Oktie, Zhu Qian, Freimuth Robert, Boyce Richard
IBM Research, Yorktown Heights, NY.
AMIA Jt Summits Transl Sci Proc. 2013 Mar 18;2013:64-8. eCollection 2013.
Structured Product Labels (SPLs) contain information about drugs that can be valuable to clinical and translational research, especially if that information can be linked to other sources that provide data about drug targets, chemical properties, interactions, and biological pathways. Unfortunately, SPLs currently provide only coarsely structured drug information and lack the detailed annotation required to support computational use cases. To help address this issue, we created LinkedSPLs, a Linked Data resource that extends the "web of drug identity" using information extracted from SPLs. In this paper we describe the mappings that LinkedSPLs provides between SPL active ingredients and DrugBank chemical entities. These mappings were created using three approaches: comparison of InChI chemical structure descriptors, exact string matching on the chemical name, and automatic (unsupervised) linkage identification. A comparison found that the three approaches are complementary and that the automatic approach performs well in terms of precision and recall.
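The three mapping approaches named above can be illustrated with a minimal Python sketch. It is not the LinkedSPLs implementation: the record layout, the function names, and every identifier and string in the toy data are hypothetical placeholders, and a simple token-overlap (Jaccard) score with an assumed threshold of 0.5 stands in for the automatic linkage step, whose actual algorithm is not described here.

```python
from typing import Optional

# Toy records. Every id, name, and InChI string here is a made-up placeholder,
# not real SPL or DrugBank data.
SPL_INGREDIENTS = [
    {"id": "spl-1", "name": "Exampleine Hydrochloride", "inchi": "InChI=1S/toy-a"},
    {"id": "spl-2", "name": "SAMPLOFEN", "inchi": None},
    {"id": "spl-3", "name": "demozole sodium", "inchi": None},
]
DRUGBANK_ENTITIES = [
    {"id": "db-1", "name": "Exampleine", "inchi": "InChI=1S/toy-a"},
    {"id": "db-2", "name": "Samplofen", "inchi": "InChI=1S/toy-b"},
    {"id": "db-3", "name": "Demozole", "inchi": "InChI=1S/toy-c"},
]


def normalize(name: str) -> str:
    """Lowercase a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def match_by_inchi(spl: dict, entities: list) -> Optional[str]:
    """Approach 1: link records whose InChI descriptors are identical."""
    if not spl["inchi"]:
        return None
    for entity in entities:
        if entity["inchi"] == spl["inchi"]:
            return entity["id"]
    return None


def match_by_name(spl: dict, entities: list) -> Optional[str]:
    """Approach 2: exact string match on the normalized chemical name."""
    target = normalize(spl["name"])
    for entity in entities:
        if normalize(entity["name"]) == target:
            return entity["id"]
    return None


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two names, in the range [0, 1]."""
    tokens_a, tokens_b = set(normalize(a).split()), set(normalize(b).split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def match_by_similarity(spl: dict, entities: list, threshold: float = 0.5) -> Optional[str]:
    """Approach 3 (stand-in): pick the most similar name above a threshold."""
    if not entities:
        return None
    best = max(entities, key=lambda e: jaccard(spl["name"], e["name"]))
    if jaccard(spl["name"], best["name"]) >= threshold:
        return best["id"]
    return None


def link(spl: dict, entities: list) -> dict:
    """Run all three approaches and report each result separately."""
    return {
        "inchi": match_by_inchi(spl, entities),
        "name": match_by_name(spl, entities),
        "similarity": match_by_similarity(spl, entities),
    }


RESULTS = {spl["id"]: link(spl, DRUGBANK_ENTITIES) for spl in SPL_INGREDIENTS}
```

Keeping the per-approach results separate, rather than returning the first hit, makes it easy to see where the approaches agree and where only one of them finds a link, which is the sense in which they are complementary.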