Malec Scott A, Boyce Richard D
University of Pittsburgh Department of Biomedical Informatics, Pittsburgh, PA.
AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:403-412. eCollection 2020.
This paper introduces a database derived from Structured Product Labels (SPLs). SPLs are legally mandated snapshots containing information on all drugs released to market in the United States. Since publication is not required for pre-trial findings, we hypothesize that SPLs may contain knowledge absent in the literature, and hence "novel." SemMedDB is an existing database of computable knowledge derived from the literature. If SPL content could be similarly transformed, novel clinically relevant assertions in the SPLs could be identified through comparison with SemMedDB. After we derive a database (containing 4,297,481 assertions), we compare the extracted content with SemMedDB for recent FDA drug approvals. We find that novelty between the SPLs and the literature is nuanced, due to the redundancy of SPLs. Highlighting areas for improvement and future work, we conclude that SPLs contain a wealth of novel knowledge relevant to research and complementary to the literature.
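The comparison step described in the abstract can be pictured as a set difference over subject-predicate-object assertions. The following is a minimal sketch under stated assumptions, not the authors' pipeline: the `Assertion` type, the `normalize` and `novel_assertions` helpers, and the toy data are all hypothetical. It assumes both sources have already been reduced to triples with comparable concept identifiers.

```python
from typing import Iterable, List, NamedTuple


class Assertion(NamedTuple):
    """Hypothetical subject-predicate-object triple (not the paper's schema)."""
    subject: str
    predicate: str
    obj: str


def normalize(a: Assertion) -> Assertion:
    # Assumption: case and surrounding whitespace are not meaningful.
    return Assertion(a.subject.strip().lower(),
                     a.predicate.strip().upper(),
                     a.obj.strip().lower())


def novel_assertions(spl: Iterable[Assertion],
                     literature: Iterable[Assertion]) -> List[Assertion]:
    """Return SPL assertions absent from the literature set, deduplicated.

    Deduplication matters because the same statement can recur across
    many labels, which would otherwise inflate the count of novel items.
    """
    known = {normalize(a) for a in literature}
    seen = set()
    novel = []
    for a in spl:
        key = normalize(a)
        if key not in known and key not in seen:
            seen.add(key)
            novel.append(key)
    return novel


# Toy data, invented for illustration only.
spl_triples = [
    Assertion("Drug_X", "treats", "Condition_Y"),
    Assertion("drug_x", "TREATS", "condition_y"),  # redundant repeat
    Assertion("drug_x", "CAUSES", "effect_z"),
]
literature_triples = [Assertion("drug_x", "CAUSES", "effect_z")]

print(novel_assertions(spl_triples, literature_triples))
```

Exact matching on normalized triples is the simplest possible design choice. A real comparison would likely need concept-level mapping between vocabularies, which this sketch does not attempt.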