Pitzer Center for Theoretical Chemistry, Department of Chemistry, University of California, Berkeley, California 94720, United States.
Molecular Medicine Program, Hospital for Sick Children, Toronto, Ontario M5S 1A8, Canada.
J Chem Theory Comput. 2023 Jul 25;19(14):4689-4700. doi: 10.1021/acs.jctc.2c01270. Epub 2023 Feb 7.
We consider the general problem of representing a biomolecule in internal coordinates (bond lengths, valence angles, and dihedral angles) and transforming that representation into three-dimensional Cartesian coordinates. We show that the internal-to-Cartesian transformation relies on correctly predicting chemically subtle correlations among the internal coordinates themselves, and that learning these correlations increases the fidelity of the Cartesian representation. We developed a machine learning algorithm, Int2Cart, that predicts bond lengths and bond angles from the backbone torsion angles and residue types of a protein. It reconstructs protein structures more accurately than either fixed bond lengths and bond angles or a static library method that relies on backbone torsion angles and residue types in a local environment. The method can also be used for structure validation: we show that the agreement between Int2Cart-predicted bond geometries and those of an AlphaFold 2 model can be used to estimate model quality. Additionally, using Int2Cart to reconstruct an ensemble of an intrinsically disordered protein (IDP) decreases the clash rate during modeling. The Int2Cart algorithm has been implemented as a publicly accessible Python package at https://github.com/THGLab/int2cart.
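To make the internal-to-Cartesian step concrete, the following is a minimal sketch of the standard atom-placement construction (often called NeRF, the natural extension reference frame), which places each new atom from one bond length, one bond angle, and one dihedral angle relative to the three preceding atoms. It is not the Int2Cart API: the function names `place_atom`, `dihedral`, and `build_chain` are hypothetical, and the numerical values in the usage note are illustrative rather than taken from the paper. Whatever bond lengths and angles are supplied (fixed, library-derived, or predicted) enter through the same three arguments.

```python
import numpy as np

def place_atom(a, b, c, bond_length, bond_angle, torsion):
    """Return atom d such that |c-d| = bond_length, angle(b, c, d) = bond_angle,
    and dihedral(a, b, c, d) = torsion. Angles are in radians."""
    bc = (c - b) / np.linalg.norm(c - b)      # unit vector along the b->c bond
    n = np.cross(b - a, bc)                   # normal to the a-b-c plane
    n = n / np.linalg.norm(n)
    m = np.cross(n, bc)                       # completes the local right-handed frame
    # Position of d in the local frame (bc, m, n) centred on c.
    local = np.array([
        -bond_length * np.cos(bond_angle),
        bond_length * np.sin(bond_angle) * np.cos(torsion),
        bond_length * np.sin(bond_angle) * np.sin(torsion),
    ])
    return c + local[0] * bc + local[1] * m + local[2] * n

def dihedral(a, b, c, d):
    """Signed dihedral angle a-b-c-d in radians, used here as a round-trip check."""
    b1, b2, b3 = b - a, c - b, d - c
    n1, n2 = np.cross(b1, b2), np.cross(b2, b3)
    y = np.dot(np.cross(n1, n2), b2 / np.linalg.norm(b2))
    x = np.dot(n1, n2)
    return np.arctan2(y, x)

def build_chain(seed, bond_lengths, bond_angles, torsions):
    """Extend three seed atoms by one atom per (length, angle, torsion) triple."""
    coords = [np.asarray(p, dtype=float) for p in seed]
    for r, theta, phi in zip(bond_lengths, bond_angles, torsions):
        coords.append(place_atom(coords[-3], coords[-2], coords[-1], r, theta, phi))
    return np.array(coords)
```

A hypothetical call such as `build_chain([(0, 1, 0), (0, 0, 0), (1.5, 0, 0)], [1.33, 1.46, 1.52], np.radians([116, 122, 111]), np.radians([180, -60, -45]))` returns a 6×3 coordinate array. Because every atom is placed relative to its predecessors, a small error in one bond angle is propagated along the rest of the chain, which is why the choice of bond geometry affects the fidelity of the reconstructed structure.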