Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, 700 Children's Drive, Columbus, OH, USA.
J Mol Model. 2012 Sep;18(9):4275-89. doi: 10.1007/s00894-012-1410-7. Epub 2012 May 8.
Computational methods are rapidly gaining importance in structural biology, driven mostly by the explosive progress of genome sequencing projects and the resulting disparity between the number of known sequences and the number of known structures: the number of available protein sequences has grown exponentially, while the number of structures has grown more slowly. There is therefore an urgent need for computational methods that predict structure and identify function from sequence. Methods that meet this need both efficiently and accurately are of paramount importance for advances in many biomedical fields, including drug development and biomarker discovery. A novel method called fast learning optimized prediction methodology (FLOPRED) is proposed for predicting protein secondary structure, using knowledge-based potentials combined with structure information from the CATH database. A neural network-based extreme learning machine (ELM) and advanced particle swarm optimization (PSO) are applied to these data, yielding better and faster convergence and more accurate results. With FLOPRED, protein secondary structures are predicted reliably, more efficiently and more accurately. These techniques yield superior classification of secondary structure elements, with a training accuracy between 83% and 87% over a wide range of hidden neurons, a cross-validated testing accuracy between 81% and 84%, and a segment overlap (SOV) score of 78%, obtained with different sets of proteins. These results are comparable to those of other recently published studies, but are obtained with greater efficiency in terms of time and cost.
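For readers unfamiliar with the ELM named in the abstract, the general idea can be sketched as follows. This is a minimal, generic illustration and not the FLOPRED implementation: the class name `TinyELM`, the sigmoid activation, the ridge term, and the toy three-class data are all assumptions made for the example. The PSO step and the knowledge-based potential features described in the abstract are not shown. The defining trait of an ELM is that the hidden-layer weights are drawn at random and left fixed, so only the output weights are fitted, here by regularized least squares.

```python
import math
import random


def solve(a, b):
    """Solve a @ x = b (a square, b a matrix) by Gauss-Jordan elimination."""
    n = len(a)
    aug = [list(ra) + list(rb) for ra, rb in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


class TinyELM:
    """Hypothetical single-hidden-layer ELM classifier (illustration only)."""

    def __init__(self, n_inputs, n_hidden, n_classes, ridge=1e-6, seed=0):
        rng = random.Random(seed)
        self.n_classes = n_classes
        self.ridge = ridge  # assumed small regularizer, keeps the solve well posed
        # Hidden weights and biases are drawn once and never trained.
        self.w = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.beta = None

    def _hidden(self, x):
        # Sigmoid activations of the fixed random hidden layer.
        return [1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
                for w, b in zip(self.w, self.b)]

    def fit(self, xs, labels):
        h = [self._hidden(x) for x in xs]
        # One-hot targets, one column per class.
        t = [[1.0 if k == y else 0.0 for k in range(self.n_classes)]
             for y in labels]
        m = len(self.b)
        # Normal equations: (H^T H + ridge * I) beta = H^T T
        hth = [[sum(row[i] * row[j] for row in h) + (self.ridge if i == j else 0.0)
                for j in range(m)] for i in range(m)]
        htt = [[sum(hr[i] * tr[k] for hr, tr in zip(h, t))
                for k in range(self.n_classes)] for i in range(m)]
        self.beta = solve(hth, htt)
        return self

    def predict(self, xs):
        predictions = []
        for x in xs:
            h = self._hidden(x)
            scores = [sum(h[i] * self.beta[i][k] for i in range(len(h)))
                      for k in range(self.n_classes)]
            # The class with the highest output score wins.
            predictions.append(max(range(self.n_classes), key=lambda k: scores[k]))
        return predictions


# Toy usage: three made-up feature vectors, one per class (e.g. helix, strand, coil).
toy_x = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
toy_y = [0, 1, 2]
model = TinyELM(n_inputs=3, n_hidden=20, n_classes=3).fit(toy_x, toy_y)
```

Because training reduces to a single linear solve rather than iterative backpropagation, an ELM is typically fast to train, which is consistent with the efficiency emphasis of the abstract. The number of hidden neurons (`n_hidden` here) is the main quantity varied in such models.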