Innovative Institute of Chinese Medicine and Pharmacy, Chengdu University of Traditional Chinese Medicine, Chengdu 611730, China.
School of Sciences, and Center for Genomics and Computational Biology, North China University of Science and Technology, Tangshan 063000, China.
Curr Drug Metab. 2019;20(3):224-228. doi: 10.2174/1389200219666181031105916.
As molecular chaperones, Heat Shock Proteins (HSPs) not only play key roles in protein folding and in maintaining protein stability, but are also linked to many kinds of diseases. HSPs have therefore become a focus of drug design. Since HSPs from different families perform distinct functions, accurately classifying HSPs into families is a key step toward understanding their biological functions. In contrast to labor-intensive and costly experimental methods, computational classification of HSP families has emerged as an alternative approach.
We reviewed papers describing the existing datasets of HSPs and the representative computational approaches developed for identifying and classifying HSPs.
Two benchmark datasets of HSPs, HSPIR and sHSPdb, were introduced; both provide invaluable resources for the computational identification of HSPs. The gold standard dataset and the sequence encoding schemes used to build computational methods for classifying HSPs were also introduced. Three representative web servers for identifying HSPs and their families were described.
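As an illustration of what a sequence encoding scheme does, the following is a minimal sketch of amino acid composition, a widely used fixed-length encoding for protein sequences. It is an assumption chosen for illustration and is not necessarily one of the schemes covered by the reviewed work; the function name `amino_acid_composition` and the example sequence are hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so that every
# sequence maps to a vector with the same layout.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Encode a protein sequence as a 20-dimensional frequency vector.

    Illustrative sketch only: residues outside the 20 standard amino
    acids (for example 'X') are ignored rather than modeled.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # Fraction of the sequence made up by each amino acid.
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy sequence, not taken from HSPIR or sHSPdb.
features = amino_acid_composition("AACD")
```

A vector of this kind can be passed to any standard classifier; because it has the same length for every protein, sequences of different lengths become directly comparable.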
The existing machine learning methods for identifying the different families of HSPs have yielded encouraging results and have helped to advance research on HSPs. However, the number of HSPs with known structures is very limited. Determining the structures of HSPs is therefore also an urgent task, one that will help to reveal their functions.