Abriata Luciano A
Brief Bioinform. 2017 Jul 1;18(4):659-669. doi: 10.1093/bib/bbw049.
This Briefing reviews the widely used, currently active, up-to-date databases derived from the worldwide Protein Data Bank (PDB) to facilitate browsing, finding and exploring its entries. These databases contain visualization and analysis tools tailored to specific kinds of molecules and interactions, often also including complex metrics precomputed by experts or external programs, and connections to sequence and functional annotation databases. Importantly, updates of most of these databases involve curation and error-checking steps based on specific expertise about the subject molecules or interactions, as well as removal of sequence redundancy, both of which lead to better data sets for mining studies than the full list of raw PDB entries. The article presents the databases in groups such as those aimed at facilitating browsing through PDB entries, their molecules and their general information; those built to link protein structure with sequence and dynamics; those specific to transmembrane proteins, nucleic acids, and interactions of biomacromolecules with each other and with small molecules or metal ions; and those concerning specific structural features or specific protein families. A few webservers directly connected to active databases, and a few databases that have been discontinued but would be important to have back, are also briefly discussed. Throughout the Briefing, sample cases are referenced in which these databases have been used to aid structural studies or advance our knowledge of biological macromolecules. A few specific examples are also given where using these databases is easier and more informative than using raw PDB data.