Using PharmGKB to train text mining approaches for identifying potential gene targets for pharmacogenomic studies.

Affiliation

College of Pharmacy, University of Minnesota, Minneapolis, MN 55455, USA.

Publication information

J Biomed Inform. 2012 Oct;45(5):862-9. doi: 10.1016/j.jbi.2012.04.007. Epub 2012 May 4.

Abstract

The main objective of this study was to investigate the feasibility of using PharmGKB, a pharmacogenomic database, as a source of training data, in combination with the text of MEDLINE abstracts, for a text mining approach to identifying potential gene targets for pathway-driven pharmacogenomics research. We used the manually curated relations between drugs and genes in the PharmGKB database to train a support vector machine predictive model and applied this model prospectively to MEDLINE abstracts. The gene targets suggested by this approach were subsequently manually reviewed. Our quantitative analysis showed that support vector machine classifiers trained on MEDLINE abstracts, with single words (unigrams) used as features and PharmGKB relations used for supervision, achieved an overall sensitivity of 85% and a specificity of 69%. The subsequent qualitative analysis showed that gene targets "suggested" by the automatic classifier were not anticipated by expert reviewers but were subsequently found to be relevant to the three drugs that were investigated: carbamazepine, lamivudine and zidovudine. Our results show that this approach is not only feasible but may also find new gene targets not identifiable by other methods, thus making it a valuable tool for pathway-driven pharmacogenomics research.
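
The pieces named in the abstract (unigram features, supervision derived from curated drug-gene relations, and evaluation by sensitivity and specificity) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the tokenization rule, and the toy drug-gene pair are all assumptions made for illustration, and the classifier itself (a support vector machine in the study) is left out.

```python
import re
from collections import Counter


def unigram_features(text):
    """Count single-word tokens (unigrams) in lower-cased text.

    The tokenization rule here is an assumption, not the paper's.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def label_abstract(drug, genes_in_abstract, curated_pairs):
    """Return 1 if any (drug, gene) pair for this abstract is curated, else 0.

    curated_pairs stands in for relations taken from a curated database;
    the pairs used below are hypothetical placeholders.
    """
    return int(any((drug, gene) in curated_pairs for gene in genes_in_abstract))


def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) for binary labels.

    Assumes y_true contains at least one positive and one negative.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Toy usage with hypothetical data.
curated = {("drug_x", "GENE_A")}
features = unigram_features("Drug_x response varies with GENE_A; drug_x dosing.")
label = label_abstract("drug_x", ["GENE_A", "GENE_B"], curated)
sens, spec = sensitivity_specificity([1, 1, 0, 0, 0], [1, 0, 0, 0, 1])
```

In a full pipeline, the unigram counts would be turned into vectors and passed to a classifier, and the predicted labels would then be scored with a function like `sensitivity_specificity`.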
