Department of Computer Science, University of Illinois at Chicago, Chicago, Illinois, USA.
Department of Chemistry and Biochemistry, Messiah University, Mechanicsburg, Pennsylvania, USA.
Protein Sci. 2021 Jun;30(6):1247-1257. doi: 10.1002/pro.4074. Epub 2021 Apr 20.
Metal cofactors within proteins perform a versatile set of essential cellular functions. To take advantage of the diverse functionality of metalloproteins, researchers have been working to design or modify metal binding sites in proteins in order to rationally tune the function or activity of the metal cofactor. This study analyzed the backbone atom geometries of metal-binding amino acids among 10 different metal binding sites across the entire Protein Data Bank. A set of 13 geometric parameters (features) was identified that can predict the presence of a metal cofactor in a protein structure with overall accuracies of up to 97%, given only the relative positions of the residues' backbone atoms. The decision tree machine-learning algorithm can quickly scan an entire protein structure for sets of residues that could form a primary metal coordination sphere upon mutagenesis, independent of their original amino acid identities. The methodology was designed for application in the field of metalloprotein engineering. A cluster analysis of the same data set demonstrated that the chosen features are also useful for identifying clusters of structurally similar metal-binding sites.
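To illustrate what "features from the relative positions of backbone atoms" might look like in practice, the following is a minimal sketch in Python. The abstract does not list the 13 geometric parameters, so the three features here (Cα-Cα distance, Cβ-Cβ distance, and a Cα-Cβ-Cβ angle) and all function names are hypothetical stand-ins rather than the authors' feature set. Feature vectors of this kind could then be passed to a decision tree classifier.

```python
import math
from itertools import combinations


def angle_deg(a, b, c):
    """Angle in degrees at vertex b, formed by the points a-b-c."""
    v1 = [p - q for p, q in zip(a, b)]
    v2 = [p - q for p, q in zip(c, b)]
    dot = sum(p * q for p, q in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against floating-point drift.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


def pair_features(res_i, res_j):
    """Hypothetical geometric features for one residue pair.

    Each residue is a dict mapping the atom names "CA" and "CB" to
    (x, y, z) coordinates in angstroms. Side-chain identity is never
    used, so the features do not depend on the amino acid type.
    """
    return {
        "ca_ca_dist": math.dist(res_i["CA"], res_j["CA"]),
        "cb_cb_dist": math.dist(res_i["CB"], res_j["CB"]),
        "ca_cb_cb_angle": angle_deg(res_i["CA"], res_i["CB"], res_j["CB"]),
    }


def site_features(residues):
    """Features for every residue pair in a candidate binding site."""
    return [pair_features(a, b) for a, b in combinations(residues, 2)]


# Made-up coordinates for two residues, for illustration only.
res_1 = {"CA": (0.0, 0.0, 0.0), "CB": (1.0, 0.0, 0.0)}
res_2 = {"CA": (1.0, 4.0, 0.0), "CB": (1.0, 3.0, 0.0)}
features = site_features([res_1, res_2])
```

Because only Cα and Cβ positions enter the calculation, the same features can be computed for any set of residues in a structure, which is what makes it possible to screen candidate sites before mutagenesis.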