Piehl Dennis W, Vallat Brinda, Truong Ivana, Morsy Habiba, Bhatt Rusham, Blaumann Santiago, Biswas Pratyoy, Rose Yana, Bittrich Sebastian, Duarte Jose M, Segura Joan, Bi Chunxiao, Myers-Turnbull Douglas, Hudson Brian P, Zardecki Christine, Burley Stephen K
Research Collaboratory for Structural Bioinformatics Protein Data Bank and the Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA.
Research Collaboratory for Structural Bioinformatics Protein Data Bank and the Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA; Rutgers Cancer Institute, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901, USA.
J Mol Biol. 2025 Jan 31:168970. doi: 10.1016/j.jmb.2025.168970.
The Protein Data Bank (PDB) was founded in 1971 as the first open-access digital data resource in biology, serving as the single global archive for three-dimensional (3D) macromolecular structure data. Current PDB holdings exceed 230,000 experimentally determined structures of proteins, nucleic acids, viruses, and macromolecular machines. The research-focused RCSB Protein Data Bank web portal, RCSB.org, facilitates search, analysis, and visualization of every PDB structure, along with more than one million Computed Structure Models from AlphaFold DB and the ModelArchive. The portal is powered by a set of publicly available Application Programming Interfaces (APIs) that both support RCSB.org users and provide programmatic access to PDB data. Given the breadth of this data collection and the many levels of granularity it encompasses, accessing the information programmatically in an efficient way can be challenging for new users. RCSB PDB has developed a Python software package, rcsb-api, that facilitates easy and efficient use of RCSB PDB APIs within a Python environment. The package is designed to streamline access to the extensive corpus of data housed within the PDB, enabling researchers to search, retrieve, and analyze 3D biostructure data. Its use is expected to accelerate research in structural biology, molecular biology and biochemistry, drug discovery, and bioinformatics by providing more efficient tools for data integration and analysis. The new toolkit is available on GitHub (github.com/rcsb/py-rcsb-api) and is published to the public Python package repository (PyPI) to foster wider usage and to support basic and applied research in fundamental biology, biomedicine, and the energy sciences.
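To make "programmatic access to PDB data" concrete, the following is a minimal sketch of the kind of raw request that a wrapper package such as rcsb-api is meant to spare users from writing by hand. It does not use rcsb-api and is not taken from the paper. It relies only on the Python standard library and assumes the public RCSB PDB Data API REST endpoint at data.rcsb.org. The helper names `entry_url` and `fetch_entry` are hypothetical, chosen here for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: base URL of the public RCSB PDB Data API (REST flavor).
DATA_API_BASE = "https://data.rcsb.org/rest/v1/core"


def entry_url(pdb_id: str) -> str:
    """Build the Data API URL for one PDB entry (hypothetical helper)."""
    pdb_id = pdb_id.strip().upper()
    # Reject empty strings and anything containing non-alphanumeric characters.
    if not pdb_id.isalnum():
        raise ValueError(f"not a plausible PDB ID: {pdb_id!r}")
    return f"{DATA_API_BASE}/entry/{urllib.parse.quote(pdb_id)}"


def fetch_entry(pdb_id: str, timeout: float = 10.0) -> dict:
    """Fetch entry-level metadata as a dict (requires network access)."""
    with urllib.request.urlopen(entry_url(pdb_id), timeout=timeout) as resp:
        return json.load(resp)
```

With network access, `fetch_entry("4hhb")` should return a nested dictionary of entry-level metadata, from which a field such as the structure title could be read with `entry.get("struct", {}).get("title")`. Handling identifiers, nested fields, and batching by hand in this way is the sort of overhead the abstract describes as challenging for new users.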