PolySearch: a web-based text mining system for extracting relationships between human diseases, genes, mutations, drugs and metabolites.

Authors

Cheng Dean, Knox Craig, Young Nelson, Stothard Paul, Damaraju Sambasivarao, Wishart David S

Affiliation

Department of Computing Science, University of Alberta, Canada.

Publication information

Nucleic Acids Res. 2008 Jul 1;36(Web Server issue):W399-405. doi: 10.1093/nar/gkn296. Epub 2008 May 16.

Abstract

A particular challenge in biomedical text mining is to find ways of handling 'comprehensive' or 'associative' queries such as 'Find all genes associated with breast cancer'. Given that many queries in genomics, proteomics or metabolomics involve these kinds of comprehensive searches, we believe that a web-based tool that could support them would be quite useful. In response to this need, we have developed the PolySearch web server. PolySearch supports >50 different classes of queries against nearly a dozen different types of text, scientific abstract or bioinformatic databases. The typical query supported by PolySearch is 'Given X, find all Y's', where X or Y can be diseases, tissues, cell compartments, gene/protein names, SNPs, mutations, drugs and metabolites. PolySearch also exploits a variety of techniques in text mining and information retrieval to identify, highlight and rank informative abstracts, paragraphs or sentences. PolySearch's performance has been assessed in tasks such as gene synonym identification, protein-protein interaction identification and disease gene identification using a variety of manually assembled 'gold standard' text corpora. Its F-measure on these tasks is 88, 81 and 79%, respectively. These values are between 5 and 50% better than those of other published tools. The server is freely available at http://wishart.biology.ualberta.ca/polysearch.
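To make the reported F-measure concrete, here is a minimal sketch of how such a score can be computed against a gold-standard set. It assumes the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not say which variant was used. The function name `f_measure` and the gene lists are hypothetical and chosen only for illustration. They are not PolySearch output or data from the paper.

```python
def f_measure(predicted, gold):
    """Balanced F-measure (F1) of a predicted set against a gold-standard set.

    Assumption: F1 = 2PR / (P + R); the abstract does not state the variant used.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        # No correct predictions: precision and recall are both 0 (or undefined).
        return 0.0
    precision = true_positives / len(predicted)  # fraction of predictions that are correct
    recall = true_positives / len(gold)          # fraction of gold items that were found
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data for a 'Given breast cancer, find all genes' style query.
gold_genes = {"BRCA1", "BRCA2", "TP53", "PTEN"}
predicted_genes = {"BRCA1", "BRCA2", "TP53", "EGFR"}

score = f_measure(predicted_genes, gold_genes)
print(f"F-measure: {score:.2f}")
```

In this toy case three of the four predictions are correct and three of the four gold items are recovered, so precision and recall are both 0.75.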

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/45b5/2447794/71430ec1b8aa/gkn296f1.jpg
