College of Life Science, Qingdao University, Qingdao 266071, China.
Methods. 2023 Oct;218:141-148. doi: 10.1016/j.ymeth.2023.08.012. Epub 2023 Aug 19.
The demand for thermophilic proteins in protein engineering has been increasing in recent years, and many machine-learning methods for identifying thermophilic proteins have emerged during this period. However, most machine learning-based studies of thermophilic protein identification have focused only on accuracy. The relationship between the meaning of the features and the physicochemical properties of the proteins has yet to be studied in depth. In this article, we focus on the relationship between the features and the thermal stability of thermophilic proteins. Our method used 2-D general series correlation pseudo amino acid composition (SC-PseAAC-General) features and achieved an accuracy of 82.76% with the J48 classifier. In addition, we found that glutamic acid occurs at a higher frequency in thermophilic proteins, where it helps maintain thermal stability by forming hydrogen bonds and salt bridges that prevent denaturation at high temperatures.
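The glutamic acid observation rests on per-residue frequencies computed from sequence. The following is a minimal sketch of that calculation only, not the authors' SC-PseAAC-General feature extraction or their J48 pipeline. The function name `amino_acid_frequencies` and both toy sequences are hypothetical and chosen purely for illustration.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_frequencies(sequence: str) -> dict[str, float]:
    """Return the fraction of each standard residue in a protein sequence.

    Non-standard symbols (e.g. X, gaps) are ignored when counting.
    """
    counts = Counter(res for res in sequence.upper() if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


# Hypothetical toy sequences (made up for this example, not real proteins).
toy_glu_rich = "MKEELLEKAEELGVEVEE"
toy_glu_poor = "MKSTLLNQASGLGVTVSA"

# Glutamic acid is "E" in one-letter code.
glu_rich = amino_acid_frequencies(toy_glu_rich)["E"]
glu_poor = amino_acid_frequencies(toy_glu_poor)["E"]

print(f"Glu fraction, toy sequence 1: {glu_rich:.3f}")
print(f"Glu fraction, toy sequence 2: {glu_poor:.3f}")
```

In a real analysis, such frequencies would be averaged over curated sets of thermophilic and non-thermophilic proteins and compared statistically. A difference between two individual sequences says nothing on its own.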