European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK.
School of Biological Sciences, Seoul National University, Seoul, South Korea.
Nucleic Acids Res. 2024 Jan 5;52(D1):D368-D375. doi: 10.1093/nar/gkad1011.
The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) has significantly impacted structural biology by amassing over 214 million predicted protein structures, up from the initial 300k structures released in 2021. Enabled by the groundbreaking AlphaFold2 artificial intelligence (AI) system, the predictions archived in AlphaFold DB have been integrated into primary data resources such as PDB, UniProt, Ensembl, InterPro and MobiDB. Our manuscript details subsequent enhancements in data archiving, covering successive releases that encompass model organisms, global health proteomes, Swiss-Prot integration, and a host of curated protein datasets. We detail the data access mechanisms of AlphaFold DB, from direct file access via FTP to advanced queries using Google Cloud Public Datasets and the programmatic access endpoints of the database. We also discuss the improvements and services added since the initial release, including enhancements to the Predicted Aligned Error viewer, customisation options for the 3D viewer, and improvements to the AlphaFold DB search engine.
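As a rough illustration of the programmatic access mentioned above, the following Python sketch builds URLs for one predicted structure. The abstract does not specify any endpoint, so the `/api/prediction/<accession>` path, the `AF-<accession>-F<n>-model_v<version>` file naming, the default version number and all function names are assumptions made for illustration; they should be checked against the current AlphaFold DB documentation before use.

```python
import json
import urllib.request
from urllib.parse import quote

# Base URL taken from the abstract; every path pattern below is an assumption.
BASE_URL = "https://alphafold.ebi.ac.uk"


def normalise_accession(uniprot_accession: str) -> str:
    """Trim whitespace, upper-case, and URL-quote a UniProt accession."""
    return quote(uniprot_accession.strip().upper())


def prediction_api_url(uniprot_accession: str) -> str:
    """Build the (assumed) metadata endpoint URL for one accession."""
    return f"{BASE_URL}/api/prediction/{normalise_accession(uniprot_accession)}"


def model_file_url(uniprot_accession: str, fragment: int = 1,
                   version: int = 4, fmt: str = "pdb") -> str:
    """Build the (assumed) coordinate file URL; fmt is 'pdb' or 'cif'."""
    if fmt not in ("pdb", "cif"):
        raise ValueError("fmt must be 'pdb' or 'cif'")
    acc = normalise_accession(uniprot_accession)
    return f"{BASE_URL}/files/AF-{acc}-F{fragment}-model_v{version}.{fmt}"


def fetch_prediction_metadata(uniprot_accession: str, timeout: float = 30.0):
    """Download and parse the JSON metadata (requires network access)."""
    url = prediction_api_url(uniprot_accession)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `model_file_url("P69905", fmt="cif")` returns a URL ending in `AF-P69905-F1-model_v4.cif` under these assumptions; only `fetch_prediction_metadata` contacts the server. Bulk downloads and database-wide queries would more naturally go through the FTP area or the Google Cloud Public Datasets listing that the abstract mentions.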