Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan, USA.
Proteins. 2022 Nov;90(11):1873-1885. doi: 10.1002/prot.26382. Epub 2022 May 16.
The family of G-protein coupled receptors (GPCRs) is one of the largest protein families in the human genome. GPCRs transduce chemical signals from the extracellular to the intracellular side of the membrane via a conformational switch between active and inactive states upon ligand binding. While experimental structures of GPCRs remain limited, high-accuracy computational predictions are now possible with AlphaFold2. However, AlphaFold2 predicts only one state and is biased toward either the active or the inactive conformation depending on the GPCR class. Here, a multi-state prediction protocol is introduced that extends AlphaFold2 to predict either the active or the inactive state with very high accuracy using state-annotated GPCR template databases. The predicted models accurately capture the main structural changes upon GPCR activation at the atomic level. For most of the benchmarked GPCRs (10 out of 15), models in the active and inactive states were closer to the experimental structures of their corresponding activation state. Median RMSDs of the transmembrane regions were 1.12 Å and 1.41 Å for the active and inactive state models, respectively. The models were more suitable for protein-ligand docking than the original AlphaFold2 models and template-based models. Finally, our prediction protocol predicted accurate GPCR structures and GPCR-peptide complex structures in GPCR Dock 2021, a blind GPCR-ligand complex modeling competition. We expect that high-accuracy GPCR models in both activation states will promote understanding of GPCR activation mechanisms and drug discovery for GPCRs. At the same time, the new protocol paves the way toward capturing the dynamics of proteins with high accuracy via machine-learning methods.
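For readers unfamiliar with the RMSD figures quoted above, the following is a minimal sketch of how an RMSD after optimal superposition (the Kabsch algorithm) could be computed with NumPy. It is not the authors' evaluation code. The function name `kabsch_rmsd`, the choice of atoms (for example, Cα atoms of the transmembrane helices), and the input layout are assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def kabsch_rmsd(model, reference):
    """RMSD (same units as the input, e.g. Å) between two (N, 3) coordinate
    arrays after optimal rigid-body superposition of `model` onto `reference`.

    Hypothetical helper: rows are assumed to be matched atom pairs, for
    example Cα atoms of the transmembrane region in model and experiment.
    """
    P = np.asarray(model, dtype=float)
    Q = np.asarray(reference, dtype=float)
    if P.ndim != 2 or P.shape[1] != 3 or P.shape != Q.shape:
        raise ValueError("model and reference must both have shape (N, 3)")

    # Remove translation by centering both coordinate sets.
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)

    # Kabsch: SVD of the covariance matrix gives the optimal rotation.
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)

    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Apply the rotation to the row-vector coordinates and measure deviation.
    P_rot = P @ R.T
    return float(np.sqrt(((P_rot - Q) ** 2).sum() / len(P)))
```

A median over a benchmark set would then be `np.median(values)`, where `values` holds one such RMSD per receptor. Structure-analysis packages provide equivalent superposition routines; the sketch is only meant to make the metric concrete.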