Using PharmGKB to train text mining approaches for identifying potential gene targets for pharmacogenomic studies.

Affiliation

College of Pharmacy, University of Minnesota, Minneapolis, MN 55455, USA.

Publication information

J Biomed Inform. 2012 Oct;45(5):862-9. doi: 10.1016/j.jbi.2012.04.007. Epub 2012 May 4.

Abstract

The main objective of this study was to investigate the feasibility of using PharmGKB, a pharmacogenomic database, as a source of training data, in combination with the text of MEDLINE abstracts, for a text mining approach to identifying potential gene targets for pathway-driven pharmacogenomics research. We used the manually curated relations between drugs and genes in the PharmGKB database to train a support vector machine predictive model and applied this model prospectively to MEDLINE abstracts. The gene targets suggested by this approach were subsequently manually reviewed. Our quantitative analysis showed that support vector machine classifiers trained on MEDLINE abstracts, with single words (unigrams) used as features and PharmGKB relations used for supervision, achieve an overall sensitivity of 85% and specificity of 69%. The subsequent qualitative analysis showed that gene targets "suggested" by the automatic classifier were not anticipated by expert reviewers but were subsequently found to be relevant to the three drugs that were investigated: carbamazepine, lamivudine and zidovudine. Our results show that this approach is not only feasible but may also find new gene targets not identifiable by other methods, making it a valuable tool for pathway-driven pharmacogenomics research.
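As a rough illustration of two terms used in the abstract, the sketch below shows how unigram features can be counted from a piece of text and how sensitivity and specificity are computed from binary labels. It is not the authors' pipeline: the function names, the tokenization rule, the example sentence, and the toy labels are all assumptions made for illustration, and the support vector machine itself is left out.

```python
import re
from collections import Counter


def unigram_features(text):
    """Count single-word tokens (unigrams) in lowercased text.

    The tokenization rule (runs of letters and digits) is an assumption;
    the abstract does not say how the text was tokenized.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical example sentence, not taken from the study's data.
features = unigram_features("CYP3A4 metabolizes carbamazepine; CYP3A4")

# Hypothetical gold labels (1 = a curated drug-gene relation exists)
# and hypothetical classifier output for eight abstracts.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

In a setup like the one described, the feature counts would feed a classifier and the gold labels would come from curated drug-gene relations; here both are stand-ins.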

Similar articles

Relation mining experiments in the pharmacogenomics domain.
J Biomed Inform. 2012 Oct;45(5):851-61. doi: 10.1016/j.jbi.2012.04.014. Epub 2012 May 10.

Learning the Structure of Biomedical Relationships from Unstructured Text.
PLoS Comput Biol. 2015 Jul 28;11(7):e1004216. doi: 10.1371/journal.pcbi.1004216. eCollection 2015 Jul.

Cited by

Drug target inference through pathway analysis of genomics data.
Adv Drug Deliv Rev. 2013 Jun 30;65(7):966-72. doi: 10.1016/j.addr.2012.12.004. Epub 2013 Jan 28.
