Department of Chemistry and Chemical Biology, Institute for Quantitative Biomedicine, Rutgers University, Piscataway, NJ, 08854, USA.
Department of Statistics, Institute for Quantitative Biomedicine, Rutgers University, Piscataway, NJ, 08854, USA.
Sci Rep. 2022 Jan 10;12(1):469. doi: 10.1038/s41598-021-04306-4.
Short hydrogen bonds (SHBs), whose donor and acceptor heteroatoms lie within 2.7 Å of each other, exhibit prominent quantum mechanical character and are connected to a wide range of essential biomolecular processes. However, exact determination of the geometry and functional roles of SHBs requires a protein structure solved at atomic resolution. In this work, we analyze 1260 high-resolution peptide and protein structures from the Protein Data Bank and develop a boosting-based machine learning model to predict the formation of SHBs between amino acids. This model, which we name machine learning-assisted prediction of short hydrogen bonds (MAPSHB), takes into account 21 structural, chemical and sequence features and their interaction effects, and effectively classifies each hydrogen bond in a protein as either a short or a normal hydrogen bond. The MAPSHB model reveals that the type of the donor amino acid plays a major role in determining the class of a hydrogen bond and that the side-chain Tyr-Asp pair has a significant probability of forming an SHB. Combining electronic structure calculations and energy decomposition analysis, we elucidate how the interplay of competing intermolecular interactions stabilizes Tyr-Asp SHBs more than those formed by other commonly observed combinations of amino acid side chains. The MAPSHB model, which is freely available on our web server, allows one to accurately and efficiently predict the presence of SHBs given a protein structure of moderate or low resolution, and will facilitate the experimental and computational refinement of protein structures.
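To make the labeling criterion concrete, the following minimal Python sketch applies the 2.7 Å donor-acceptor cutoff stated in the abstract to a pair of heteroatom coordinates. It illustrates only how a hydrogen bond in a high-resolution structure could be labeled short or normal. It is not the MAPSHB model, which predicts the class from 21 features. The function names, the inclusive treatment of the cutoff, and the example coordinates are assumptions made for illustration.

```python
import math

# Cutoff taken from the abstract: donor and acceptor heteroatoms within 2.7 Angstrom.
SHB_CUTOFF_ANGSTROM = 2.7


def heteroatom_distance(donor, acceptor):
    """Euclidean distance (in Angstrom) between donor and acceptor heteroatoms."""
    return math.dist(donor, acceptor)


def label_hydrogen_bond(donor, acceptor, cutoff=SHB_CUTOFF_ANGSTROM):
    """Label a hydrogen bond "short" or "normal" from its heteroatom distance.

    Treating the cutoff as inclusive is an assumption of this sketch.
    """
    return "short" if heteroatom_distance(donor, acceptor) <= cutoff else "normal"


# Hypothetical coordinates (Angstrom), not taken from any real structure.
donor_xyz = (0.0, 0.0, 0.0)
print(label_hydrogen_bond(donor_xyz, (2.5, 0.0, 0.0)))
print(label_hydrogen_bond(donor_xyz, (2.9, 0.0, 0.0)))
```

Labels of this kind require reliable atomic coordinates, which is why the abstract restricts the training data to high-resolution structures and uses a learned model for structures of moderate or low resolution.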