Combining automatic table classification and relationship extraction in extracting anticancer drug-side effect pairs from full-text articles.

Author information

Xu Rong, Wang QuanQiu

Affiliations

Medical Informatics Program, Center for Clinical Investigation, Case Western Reserve University, Cleveland, OH 44106, United States.

ThinTek, LLC, Palo Alto, CA 94306, United States.

Publication information

J Biomed Inform. 2015 Feb;53:128-35. doi: 10.1016/j.jbi.2014.10.002. Epub 2014 Oct 13.

Abstract

Knowledge of anticancer drug-associated side effects is often spread across multiple heterogeneous and complementary data sources. A comprehensive anticancer drug-side effect (drug-SE) relationship knowledge base is important for computation-based drug target discovery, drug toxicity prediction, and drug repositioning. In this study, we present a two-step approach that combines table classification and relationship extraction to extract drug-SE pairs from a large number of high-profile oncological full-text articles. The data consist of 31,255 tables downloaded from the Journal of Clinical Oncology (JCO). We first trained a statistical classifier to classify tables into SE-related and SE-unrelated categories. We then extracted drug-SE pairs from the SE-related tables. We compared the drug side effect knowledge extracted from JCO tables with that derived from FDA drug labels. Finally, we systematically analyzed the relationships between anticancer drug-associated side effects and drug-associated gene targets, metabolism genes, and disease indications. The statistical table classifier is effective in classifying tables into SE-related and SE-unrelated categories (precision: 0.711; recall: 0.941; F1: 0.810). We extracted a total of 26,918 drug-SE pairs from SE-related tables with a precision of 0.605, a recall of 0.460, and an F1 of 0.520. The drug-SE pairs extracted from JCO tables are largely complementary to those derived from FDA drug labels; as many as 84.7% of the pairs extracted from JCO tables are not included in a side effect database constructed from FDA drug labels. Side effects associated with anticancer drugs correlate positively with drug target genes, drug metabolism genes, and disease indications.
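
The F1 values reported in the abstract can be related to the stated precision and recall with the standard harmonic-mean definition. The short Python sketch below is illustrative only and is not taken from the paper; the function name `f1_score` is a hypothetical helper. Under that definition, the table classifier figures reproduce the reported 0.810, while the extraction figures give about 0.523, which agrees with the reported 0.520 to within rounding.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as stated in the abstract.
table_classifier_f1 = f1_score(0.711, 0.941)
pair_extraction_f1 = f1_score(0.605, 0.460)

print(f"table classification F1: {table_classifier_f1:.3f}")
print(f"drug-SE pair extraction F1: {pair_extraction_f1:.3f}")
```

The small gap in the second value may come from rounding of the published precision and recall; the abstract does not say how the figure was computed.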

Similar articles

2
Large-scale automatic extraction of side effects associated with targeted anticancer drugs from full-text oncological articles.
J Biomed Inform. 2015 Jun;55:64-72. doi: 10.1016/j.jbi.2015.03.009. Epub 2015 Mar 27.
3
Toward creation of a cancer drug toxicity knowledge base: automatically extracting cancer drug-side effect relationships from the literature.
J Am Med Inform Assoc. 2014 Jan-Feb;21(1):90-6. doi: 10.1136/amiajnl-2012-001584. Epub 2013 May 18.
4
Automatic construction of a large-scale and accurate drug-side-effect association knowledge base from biomedical literature.
J Biomed Inform. 2014 Oct;51:191-9. doi: 10.1016/j.jbi.2014.05.013. Epub 2014 Jun 10.
9
tcTKB: an integrated cardiovascular toxicity knowledge base for targeted cancer drugs.
AMIA Annu Symp Proc. 2015 Nov 5;2015:1342-51. eCollection 2015.
10
Investigating drug repositioning opportunities in FDA drug labels through topic modeling.
BMC Bioinformatics. 2012;13 Suppl 15(Suppl 15):S6. doi: 10.1186/1471-2105-13-S15-S6. Epub 2012 Sep 11.
