State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai 200438, China.
Bioinformatics. 2023 Feb 3;39(2). doi: 10.1093/bioinformatics/btad076.
As the FAIR (Findable, Accessible, Interoperable, Reusable) principles have become widely accepted in the proteomics field, public proteomics databases, under the guidance of ProteomeXchange and the Human Proteome Organization Proteomics Standards Initiative, have been providing Application Programming Interfaces (APIs) for programmatic access. Building on the logic by which proteomics data are generated, we present Patpat, an extensible framework for searching public datasets that merges results from multiple databases to help researchers find their proteins of interest in the vast body of mass spectrometry data. Patpat's 2D strategy of combining results from multiple databases lets users supply only protein identifiers to obtain metadata for relevant datasets, improving the findability (the 'Findable' principle) of proteomics data.
The Patpat framework is released as open source under the Apache 2.0 license, and the source code is freely available on GitHub (https://github.com/henry-leo/Patpat).
Supplementary data are available at Bioinformatics online.