Institute of Biomedical Chemistry, 119121 Moscow, Russia.
Int J Mol Sci. 2023 Aug 30;24(17):13431. doi: 10.3390/ijms241713431.
Amino acid substitutions and post-translational modifications (PTMs) play a crucial role in many cellular processes by directly affecting the structural and dynamic features of protein interactions. Despite their importance, the understanding of PTMs at the structural level remains largely incomplete. The Protein Data Bank (PDB) contains relatively few three-dimensional (3D) structures with post-translational modifications. Although recent years have seen significant progress in 3D modeling of proteins using neural networks, the problem of accurately predicting PTMs in protein structures has been largely ignored. Predicting accurate 3D models of PTMs in proteins is closely related to another fundamental problem: predicting the correct side-chain conformations of amino acid residues. An analysis of publications and of paid and free software packages for modeling 3D structures showed that most focus on unmodified proteins and canonical amino acid residues; the number of articles and software packages that emphasize modeling 3D PTM structures is an order of magnitude smaller. This paper focuses on modeling the side-chain conformations of proteins containing PTMs (nonstandard amino acid residues). We compiled our own libraries of the PTMs most frequently observed in the PDB and implemented several algorithms for predicting side-chain conformations at modification sites and in their immediate protein environment. We also carried out a comprehensive analysis of the algorithms, both on their own and in comparison with the widely used Rosetta and FoldX structure modeling packages. The proposed algorithmic solutions are comparable in their characteristics to Rosetta and FoldX and have great potential for further development and optimization. The source code of the algorithmic solutions has been deposited and is available on GitHub.