Kabir Md Wasi Ul, Alawad Duaa Mohammad, Mishra Avdesh, Hoque Md Tamjidul
Computer Science Department, University of New Orleans, New Orleans, LA 70148, USA.
Department of Electrical Engineering and Computer Science, Texas A&M University-Kingsville, Kingsville, TX 78363, USA.
Biology (Basel). 2023 Jul 19;12(7):1020. doi: 10.3390/biology12071020.
Protein molecules show varying degrees of flexibility throughout their three-dimensional structures. This flexibility is determined by fluctuations in the torsion angles phi (φ) and psi (ψ), which define the protein backbone. The angle fluctuations are derived from variations in the backbone torsion angles observed across different models of the same protein. Analyzing these fluctuations alongside those in Cartesian coordinate space helps us understand the structural flexibility of proteins. Predicted torsion angle fluctuations are also valuable for determining protein function and structure, where they can serve as constraints. In this study, we develop a machine learning method, TAFPred, that predicts torsion angle fluctuations directly from protein sequences. The method incorporates features such as disorder probability, position-specific scoring matrix (PSSM) profiles, and secondary structure probabilities, among others. Using an optimized Light Gradient Boosting Machine (LightGBM) regressor, TAFPred achieved Pearson correlation coefficients (PCC) of 0.746 and 0.737 and mean absolute errors (MAE) of 0.114 and 0.123 for the φ and ψ angles, respectively. Compared to the state-of-the-art method, TAFPred improved MAE by 10.08% and PCC by 24.83% for the φ angle, and MAE by 9.93% and PCC by 22.37% for the ψ angle.
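The two evaluation metrics reported in the abstract, MAE and PCC, and a percentage improvement over a baseline can be sketched as follows. This is a minimal illustration in standard-library Python, not the authors' evaluation code: the function names and the toy values are hypothetical, and the relative-improvement formula is an assumption, since the abstract does not state how its percentages were computed.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_correlation(y_true, y_pred):
    """Pearson correlation coefficient between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_t * var_p)


def relative_improvement(baseline, new, lower_is_better):
    """Percentage change relative to the baseline score.

    Assumption: improvement is measured relative to the baseline value.
    The abstract does not specify the formula behind its percentages.
    """
    change = (baseline - new) if lower_is_better else (new - baseline)
    return 100.0 * change / baseline


# Hypothetical toy values, not data from the study.
observed = [0.10, 0.25, 0.40, 0.15]
predicted = [0.12, 0.20, 0.35, 0.18]
mae = mean_absolute_error(observed, predicted)
pcc = pearson_correlation(observed, predicted)
```

MAE is an error, so lower is better, while PCC is a correlation, so higher is better. The `lower_is_better` flag keeps the sign of the reported improvement positive in both cases.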