European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
Database (Oxford). 2013 May 2;2013:bat030. doi: 10.1093/database/bat030. Print 2013.
The extraction of information from the scientific literature is a complex task, both for researchers doing manual curation and for automatic text processing solutions. The identification of protein-protein interactions (PPIs) requires the extraction of protein named entities and their relations. Semi-automatic interactive support is one way to combine the two approaches into an efficient working process that generates reliable database content. In principle, PPIs can be extracted with different methods, which can be combined in different ways to deliver high-precision results, high-recall results, or both at the same time. Interactive use is possible if the analytical methods are fast enough to process the retrieved documents. PCorral provides interactive mining of PPIs from the scientific literature, allowing curators to skim MEDLINE for PPIs with low overhead. The keyword query to PCorral steers the selection of documents, and the subsequent text analysis generates high-recall and high-precision results for the curator. The underlying components of PCorral process the documents on the fly and are also available as a web service from the Whatizit infrastructure. The user interface summarizes the identified PPI results, and the entities involved are linked to relevant resources and databases. Altogether, PCorral serves curators at both the beginning and the end of the curation workflow, for information retrieval and for information extraction. Database URL: http://www.ebi.ac.uk/Rebholz-srv/pcorral.
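The abstract's point that different extraction methods can be combined to give high-recall and high-precision results can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from PCorral or Whatizit: the toy protein lexicon `PROTEINS`, the verb list `INTERACTION_VERBS`, and the functions `cooccurrence_pairs`, `pattern_pairs` and `extract` are hypothetical. Sentence-level co-occurrence stands in for a high-recall method, and a simple "protein, interaction verb, protein" pattern stands in for a higher-precision one.

```python
import re
from itertools import combinations

# Hypothetical toy lexicon; a real system would use a protein named-entity tagger.
PROTEINS = {"BRCA1", "BARD1", "TP53", "MDM2"}
# Hypothetical interaction verbs used by the pattern-based method.
INTERACTION_VERBS = {"binds", "interacts", "phosphorylates", "inhibits"}


def tokenize(sentence):
    """Split a sentence into alphanumeric tokens."""
    return re.findall(r"[A-Za-z0-9]+", sentence)


def cooccurrence_pairs(sentence):
    """High-recall method: every pair of distinct proteins in one sentence."""
    found = sorted({tok for tok in tokenize(sentence) if tok in PROTEINS})
    return {frozenset(pair) for pair in combinations(found, 2)}


def pattern_pairs(sentence):
    """Higher-precision method: nearest protein on each side of an interaction verb."""
    tokens = tokenize(sentence)
    pairs = set()
    for i, tok in enumerate(tokens):
        if tok.lower() not in INTERACTION_VERBS:
            continue
        left = [t for t in tokens[:i] if t in PROTEINS]
        right = [t for t in tokens[i + 1:] if t in PROTEINS]
        if left and right and left[-1] != right[0]:
            pairs.add(frozenset((left[-1], right[0])))
    return pairs


def extract(sentences):
    """Return (high_recall_pairs, high_precision_pairs) for a list of sentences."""
    recall_set, precision_set = set(), set()
    for sentence in sentences:
        recall_set |= cooccurrence_pairs(sentence)
        precision_set |= pattern_pairs(sentence)
    return recall_set, precision_set


if __name__ == "__main__":
    sentences = [
        "BRCA1 binds BARD1 in the nucleus.",
        "TP53 and MDM2 were both measured.",
    ]
    recall_set, precision_set = extract(sentences)
    print("high recall:", sorted(sorted(p) for p in recall_set))
    print("high precision:", sorted(sorted(p) for p in precision_set))
```

In this sketch every pattern match is also a co-occurrence, so a curator could review the pattern-based pairs first and fall back to the larger co-occurrence set when coverage matters more than precision.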