Antonov Alexey V, Dietmann Sabine, Wong Philip, Rodchenkov Igor, Mewes Hans W
Helmholtz Zentrum München-German Research Center for Environmental Health (GmbH), Institute for Bioinformatics and System Biology, Ingolstadter Landstrasse 1, D-85764 Neuherberg, Germany.
J Proteome Res. 2009 Mar;8(3):1193-7. doi: 10.1021/pr800804d.
Proteomics studies address a broad spectrum of problems, ranging from the discovery of compartment-specific cell proteomes to clinical applications such as the identification of diagnostic markers and the monitoring of drug treatment effects. In most cases, the ultimate result of a proteomics study is a list of proteins found to be present (or differentially present) under the cell physiological conditions studied. Normally, these results are published directly in the article in one or several tables. In many cases, this type of information remains scattered across hundreds of proteomics publications. We have developed a Web mining tool that collects this information by searching through full-text papers and automatically selecting tables that report a list of protein identifiers. By searching through major proteomics journals, we have collected approximately 800 recently published independent studies, which reported about 1000 different protein lists. On the basis of these data, we developed a computational tool, PLIPS (Protein Lists Identified in Proteomics Studies). PLIPS accepts a list of protein/gene identifiers as input. Using statistical analyses, PLIPS identifies recently published proteomics studies that report protein lists significantly intersecting with the query list. PLIPS is a freely available Web-based tool (http://mips.helmholtz-muenchen.de/proj/plips).
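The abstract does not state which statistical test PLIPS applies. As an illustration only, the sketch below scores the overlap between a query list and one published list with a one-sided hypergeometric test, a common choice for list-intersection problems. The function names, the example identifiers, and the `universe_size` parameter (the number of proteins that could in principle appear in either list) are assumptions made for this example and are not taken from PLIPS.

```python
from math import comb


def overlap_p_value(query, published, universe_size):
    """Return (overlap, p) for a one-sided hypergeometric test.

    p is the probability of seeing at least the observed overlap if the
    query list were drawn at random from a universe of `universe_size`
    proteins. This is an assumed scoring scheme, not the PLIPS method.
    """
    query, published = set(query), set(published)
    n, K, N = len(query), len(published), universe_size
    if max(n, K) > N:
        raise ValueError("universe_size is smaller than one of the lists")
    k = len(query & published)  # observed overlap
    # Sum the upper tail P(X >= k) using exact integer arithmetic.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return k, tail / comb(N, n)


def rank_studies(query, studies, universe_size):
    """Sort hypothetical {study_id: protein_list} entries by ascending p-value."""
    scored = [
        (study_id, *overlap_p_value(query, proteins, universe_size))
        for study_id, proteins in studies.items()
    ]
    return sorted(scored, key=lambda row: row[2])


if __name__ == "__main__":
    # Made-up identifiers, for illustration only.
    studies = {
        "study_A": ["P2", "P3", "P4", "P5"],
        "study_B": ["P7", "P8"],
    }
    for study_id, k, p in rank_studies(["P1", "P2", "P3"], studies, 10):
        print(study_id, k, round(p, 4))
```

When a query is compared against about 1000 lists, as in the collection described above, the raw p-values would normally need a multiple-testing correction before any overlap is called significant; that step is omitted here.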