Di Grazia Luca, Aminpour Maral, Vezzetti Enrico, Rezania Vahid, Marcolin Federica, Tuszynski Jack Adam
DIGEP, Politecnico di Torino, Torino, Italy.
Department of Physics, University of Alberta, Edmonton, Alberta, Canada.
Proteins. 2020 Aug 11:e25993. doi: 10.1002/prot.25993.
This article reports research aimed at translating biometric 3D face recognition concepts and algorithms into the field of protein biophysics in order to classify morphological features of protein surfaces precisely and rapidly. Both human faces and protein surfaces are free-form surfaces, so descriptors from differential geometry can be used to describe them by applying the feature-extraction principles developed for computer vision and pattern recognition. The first part of this study focused on building the protein dataset using a simulation tool and performing feature extraction using novel geometrical descriptors. The second part tested the method on two examples: the first involved the classification of tubulin isotypes, and the second compared tubulin with FtsZ, its bacterial analog. An additional test involved several unrelated proteins. Different classification methodologies were used: a classic supervised approach with a support vector machine (SVM) classifier and an unsupervised approach based on k-means clustering. The best result was obtained with an SVM using the radial basis function kernel. The results are significant and competitive with state-of-the-art protein classification methods, pointing to a new methodological direction in protein structure analysis.
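As a rough illustration of the kind of pipeline described above, the following sketch computes a few standard differential-geometry descriptors at a surface point from its two principal curvatures, and evaluates a radial basis function kernel between two descriptor vectors. The abstract does not name the novel descriptors actually used, so the ones shown here (Gaussian curvature, mean curvature, shape index, curvedness) are assumptions chosen because they are common in 3D surface analysis. The function names and the `gamma` value are hypothetical, and the sign convention of the shape index varies between authors.

```python
import math


def curvature_descriptors(k1, k2):
    """Descriptors at one surface point from its principal curvatures.

    Illustrative only: these are textbook descriptors, not necessarily
    the ones used in the study.
    """
    k_max, k_min = max(k1, k2), min(k1, k2)
    gaussian = k_max * k_min                 # K = k1 * k2
    mean = 0.5 * (k_max + k_min)             # H = (k1 + k2) / 2
    curvedness = math.sqrt(0.5 * (k_max ** 2 + k_min ** 2))

    # Shape index in [-1, 1] (one possible sign convention).
    if k_max == k_min:
        # Umbilic point: the formula's limit is +/-1; a flat point is undefined.
        shape_index = None if k_max == 0 else math.copysign(1.0, k_max)
    else:
        shape_index = (2.0 / math.pi) * math.atan(
            (k_max + k_min) / (k_max - k_min)
        )

    return {
        "gaussian": gaussian,
        "mean": mean,
        "curvedness": curvedness,
        "shape_index": shape_index,
    }


def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel exp(-gamma * ||x - y||^2); gamma is a hypothetical setting."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# A saddle-like point and a sphere-like point.
saddle = curvature_descriptors(1.0, -1.0)
cap = curvature_descriptors(2.0, 2.0)

# Similarity between the two points in (mean, curvedness) feature space.
similarity = rbf_kernel(
    [saddle["mean"], saddle["curvedness"]],
    [cap["mean"], cap["curvedness"]],
)
```

In a full workflow, descriptor vectors of this kind would be aggregated per protein surface and passed to a classifier. An SVM with an RBF kernel uses exactly this similarity measure internally, while k-means would group the same vectors without labels.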