Swapna G V T, Dube Namita, Roth Monica J, Montelione Gaetano T
Dept. of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA.
Department of Pharmacology, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway NJ 08854 USA.
bioRxiv. 2024 Dec 16:2024.07.15.603529. doi: 10.1101/2024.07.15.603529.
The Solute Carrier (SLC) superfamily of integral membrane proteins functions to transport a wide array of small molecules across plasma and organelle membranes. SLC proteins also function as important drug transporters and as viral receptors. Despite being classified as a single superfamily, SLC proteins do not share a single common fold classification; however, most belong to multi-pass transmembrane helical protein fold families. SLC proteins populate different conformational states during the solute transport process, including outward-open, intermediate (occluded), and inward-open conformational states. For some SLC fold families this structural "flipping" corresponds to swapping between conformations of their N-terminal and C-terminal symmetry-related sub-structures. Conventional AlphaFold2, AlphaFold3, or Evolutionary Scale Modeling (ESM) methods typically generate models for only one of these multiple conformational states of SLC proteins. Several modifications of these AI-based protocols for modeling multiple conformational states of proteins have been described recently. These methods are often impacted by "memorization" of one of the alternative conformational states, and do not always provide both the inward- and outward-facing conformations of SLC proteins. Here we describe a combined ESM and template-based modeling process, built on a previously described template-based modeling method that relies on the internal pseudo-symmetry of many SLC proteins, to consistently model alternate conformational states of SLC proteins. We further demonstrate how the resulting multi-state models can be validated experimentally by comparison with sequence-based evolutionary co-variance data (ECs) that encode information about contacts present in the various conformational states adopted by the protein. This simple, rapid, and robust approach for modeling conformational landscapes of pseudo-symmetric SLC proteins is demonstrated for several integral membrane protein transporters, including SLC35F2, the receptor of a feline leukemia virus envelope protein required for viral entry into eukaryotic cells.
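The comparison of multi-state models with ECs can be illustrated with a small sketch. It is not the authors' pipeline: the function names (`contact_set`, `classify_ecs`), the 8 Å Cα-Cα contact cutoff, the minimum sequence separation of 5, and the toy coordinates are all assumptions made for illustration. The idea shown is only that an EC pair absent from the contacts of one conformational state may be satisfied by the other, so ECs are scored against both models.

```python
from itertools import combinations
from math import dist  # Python 3.8+


def contact_set(coords, cutoff=8.0, min_separation=5):
    """Residue pairs (i, j), i < j, whose coordinates lie within `cutoff`.

    `coords` maps residue number -> (x, y, z), e.g. C-alpha positions.
    The cutoff and sequence separation are illustrative assumptions.
    """
    contacts = set()
    for i, j in combinations(sorted(coords), 2):
        if j - i >= min_separation and dist(coords[i], coords[j]) <= cutoff:
            contacts.add((i, j))
    return contacts


def classify_ecs(ec_pairs, outward, inward, cutoff=8.0):
    """Sort each EC pair by which modeled state (if any) contains the contact."""
    out_contacts = contact_set(outward, cutoff)
    in_contacts = contact_set(inward, cutoff)
    result = {"both": [], "outward_only": [], "inward_only": [], "neither": []}
    for i, j in ec_pairs:
        pair = (min(i, j), max(i, j))
        in_out, in_in = pair in out_contacts, pair in in_contacts
        if in_out and in_in:
            result["both"].append(pair)
        elif in_out:
            result["outward_only"].append(pair)
        elif in_in:
            result["inward_only"].append(pair)
        else:
            result["neither"].append(pair)
    return result


# Toy coordinates (hypothetical, in angstroms) for two modeled states.
outward_model = {1: (0, 0, 0), 10: (5, 0, 0), 20: (6, 3, 0), 30: (40, 0, 0)}
inward_model = {1: (0, 0, 0), 10: (5, 0, 0), 20: (30, 0, 0), 30: (34, 0, 0)}
ec_pairs = [(1, 10), (10, 20), (20, 30), (1, 30)]  # hypothetical EC pairs

result = classify_ecs(ec_pairs, outward_model, inward_model)
for category, pairs in result.items():
    print(category, pairs)
```

In this toy case the pairs found in only one state are the informative ones: they are the kind of contact that a single-state model would leave unexplained. Pairs in the "neither" category would need closer inspection (for example, a possible false-positive EC or an unmodeled state).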