Department of Computer Science, Tennessee State University, Nashville, TN 37209, USA.
Department of Mechanical Engineering, Middle East Technical University, Ankara 06800, Türkiye.
Biomolecules. 2023 May 31;13(6):923. doi: 10.3390/biom13060923.
Determining the secondary structure elements (SSEs) of a protein is a crucial intermediate step in experimental tertiary structure determination. SSEs are typically identified with popular tools such as DSSP and STRIDE, which use atomic coordinates to locate hydrogen bonds and assign SSEs from them. When some of these spatial atomic details are missing, locating SSEs becomes difficult. To address this problem, three approaches for classifying SSE types using the Cα atoms of protein chains were developed: (1) a mathematical approach, (2) a deep learning approach, and (3) an ensemble of five machine learning models. The proposed methods were compared against each other and against a state-of-the-art approach, PCASSO.
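To make the idea of classifying SSEs from Cα atoms concrete, here is a minimal sketch of a geometric rule applied to windows of four consecutive Cα positions. It is not the mathematical approach of the paper, whose details the abstract does not give. The function names, the window size, and every threshold below are illustrative assumptions, based only on the commonly cited Cα geometry of ideal helices (Cα(i) to Cα(i+3) distance near 5 Å, pseudo-dihedral near +50°) and extended strands (distance near 10 Å, pseudo-dihedral near ±180°).

```python
import numpy as np


def pseudo_dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four consecutive C-alpha positions."""
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return np.degrees(np.arctan2(y, x))


def classify_ca_windows(ca_coords):
    """Label each window of four consecutive C-alpha atoms as H, E, or C.

    All thresholds are hypothetical, chosen for illustration only.
    """
    ca = np.asarray(ca_coords, dtype=float)
    labels = []
    for i in range(len(ca) - 3):
        d13 = np.linalg.norm(ca[i + 3] - ca[i])  # C-alpha(i) to C-alpha(i+3) distance
        tau = pseudo_dihedral(ca[i], ca[i + 1], ca[i + 2], ca[i + 3])
        if d13 < 6.0 and 30.0 <= tau <= 80.0:
            labels.append("H")  # compact, right-handed turn: helix-like
        elif d13 > 9.0 and abs(tau) >= 150.0:
            labels.append("E")  # extended, near-planar zigzag: strand-like
        else:
            labels.append("C")  # anything else: coil
    return labels


# Usage: an idealized helix trace (radius 2.3 A, rise 1.5 A, 100 degrees per residue).
ideal_helix = np.array(
    [[2.3 * np.cos(np.radians(100 * i)), 2.3 * np.sin(np.radians(100 * i)), 1.5 * i]
     for i in range(8)]
)
print(classify_ca_windows(ideal_helix))  # ['H', 'H', 'H', 'H', 'H']
```

A fixed-threshold rule like this only separates idealized geometry. Real chains show much broader distributions, which is one reason learned classifiers such as those named in the abstract may be preferable.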