Laboratoire de Biologie Structurale de la Cellule (BIOC), CNRS, Ecole Polytechnique, Institut Polytechnique de Paris, F-91128 Palaiseau, France.
Acta Crystallogr D Struct Biol. 2022 Apr 1;78(Pt 4):517-531. doi: 10.1107/S2059798322002157. Epub 2022 Mar 16.
The recent breakthrough in protein structure prediction made by deep-learning programs such as AlphaFold and RoseTTAFold will certainly revolutionize biology over the coming decades. The scientific community is only starting to appreciate the various applications, benefits and limitations of these protein models. Yet, after the initial excitement caused by this revolution, it is important to evaluate the impact of the proposed models and their overall quality so that biologists do not misinterpret or overinterpret them. One of the first applications of these models is in solving the 'phase problem' encountered in X-ray crystallography when calculating electron-density maps from diffraction data. Indeed, the most frequently used technique to derive electron-density maps is molecular replacement. As this technique relies on knowledge of the structure of a protein that shares strong structural similarity with the studied protein, the availability of high-accuracy models is critical for successful structure solution. After collecting a 2.45 Å resolution data set, we struggled for two years to solve the crystal structure of a protein involved in the nonsense-mediated mRNA decay pathway, an mRNA quality-control pathway dedicated to the elimination of eukaryotic mRNAs harboring premature stop codons. We tried different methods (isomorphous replacement, anomalous diffraction and molecular replacement) to determine this structure, but all failed until we succeeded straightforwardly using both AlphaFold and RoseTTAFold models. Here, we describe how these new models helped us to solve this structure and conclude that, in our case, the AlphaFold model largely outperforms the other models. We also discuss the importance of search-model generation for successful molecular replacement.