Toward creation of a cancer drug toxicity knowledge base: automatically extracting cancer drug-side effect relationships from the literature.

Affiliation

Medical Informatics Program, Center for Clinical Investigation, Case Western Reserve University, Cleveland, Ohio, USA.

Publication information

J Am Med Inform Assoc. 2014 Jan-Feb;21(1):90-6. doi: 10.1136/amiajnl-2012-001584. Epub 2013 May 18.

Abstract

OBJECTIVE

A comprehensive and machine-understandable cancer drug-side effect (drug-SE) relationship knowledge base is important for in silico cancer drug target discovery, drug repurposing, and toxicity prediction, and for personalized risk-benefit decisions by cancer patients. While US Food and Drug Administration (FDA) drug labels capture well-known cancer drug SE information, much cancer drug SE knowledge remains buried in the published biomedical literature. We present a relationship extraction approach to extract cancer drug-SE pairs from the literature.

DATA AND METHODS

We used 21,354,075 MEDLINE records as the text corpus. We extracted drug-SE co-occurrence pairs using a cancer drug lexicon and a clean SE lexicon that we created. We then developed two filtering approaches to remove drug-disease treatment pairs and subsequently a ranking scheme to further prioritize filtered pairs. Finally, we analyzed relationships among SEs, gene targets, and indications.
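
The co-occurrence step can be illustrated with a minimal sketch. The abstract does not state the matching granularity or method, so sentence-level co-occurrence, the naive sentence splitter, and the toy lexicons `DRUG_LEXICON` and `SE_LEXICON` below are all assumptions made for illustration, not the authors' implementation.

```python
import re
from itertools import product

# Hypothetical toy lexicons; the paper's actual lexicons are not reproduced here.
DRUG_LEXICON = {"cisplatin", "doxorubicin"}
SE_LEXICON = {"nausea", "cardiotoxicity", "hearing loss"}


def find_terms(sentence, lexicon):
    """Return the lexicon terms that appear as whole words or phrases in the sentence."""
    lowered = sentence.lower()
    return {t for t in lexicon if re.search(r"\b" + re.escape(t) + r"\b", lowered)}


def cooccurrence_pairs(text):
    """Collect (drug, side effect) pairs mentioned in the same sentence."""
    pairs = set()
    # Naive sentence split after ., ! or ? followed by whitespace (an assumption).
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        drugs = find_terms(sentence, DRUG_LEXICON)
        effects = find_terms(sentence, SE_LEXICON)
        pairs.update(product(drugs, effects))
    return pairs


example = "Cisplatin often causes nausea and hearing loss. Doxorubicin was well tolerated."
print(sorted(cooccurrence_pairs(example)))
# [('cisplatin', 'hearing loss'), ('cisplatin', 'nausea')]
```

Raw co-occurrence of this kind also picks up drug-disease treatment pairs (a drug mentioned alongside the condition it treats), which is why the filtering and ranking steps described above are needed.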

RESULTS

We extracted 56,602 cancer drug-SE pairs. The filtering algorithms improved the precision of extracted pairs from 0.252 at baseline to 0.426, representing a 69% improvement in precision with no decrease in recall. The ranking algorithm further prioritized filtered pairs and achieved a precision of 0.778 for top-ranked pairs. We showed that cancer drugs that share SEs tend to have overlapping gene targets and overlapping indications.
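
The 69% figure is a relative gain, not an absolute one, as a short check shows. The helper name `relative_improvement` is introduced here for illustration only.

```python
def relative_improvement(baseline, improved):
    """Relative gain of `improved` over `baseline`, as a fraction of the baseline."""
    return (improved - baseline) / baseline


# Precision values reported in the abstract: baseline vs. after filtering.
gain = relative_improvement(0.252, 0.426)
print(f"{gain:.1%}")  # 69.0%
```

The absolute change is 0.174 in precision; dividing by the baseline of 0.252 gives the reported relative improvement of about 69%.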

CONCLUSIONS

The relationship extraction approach is effective in extracting many cancer drug-SE pairs from the literature. This unique knowledge base, when combined with existing cancer drug SE knowledge, can facilitate drug target discovery, drug repurposing, and toxicity prediction.
