Sanchez Ricardo, Riddle Megan, Woo Jongwook, Momand Jamil
Department of Chemistry and Biochemistry, California State University, Los Angeles, California 90032, USA.
Protein Sci. 2008 Mar;17(3):473-81. doi: 10.1110/ps.073252408.
Protein cysteine thiols can be divided into four groups based on their reactivities: those that form permanent structural disulfide bonds, those that coordinate with metals, those that remain in the reduced state, and those that are susceptible to reversible oxidation. Physicochemical parameters of oxidation-susceptible protein thiols were organized into a database named the Balanced Oxidation Susceptible Cysteine Thiol Database (BALOSCTdb). BALOSCTdb contains 161 cysteine thiols that undergo reversible oxidation and 161 cysteine thiols that are not susceptible to oxidation. Each cysteine was represented by a set of 12 parameters, one of which was a label (1/0) to indicate whether its thiol moiety is susceptible to oxidation. A computer program (the C4.5 decision tree classifier re-implemented as the J48 classifier) segregated cysteines into oxidation-susceptible and oxidation-non-susceptible classes. The classifier selected three parameters critical for prediction of thiol oxidation susceptibility: (1) distance to the nearest cysteine sulfur atom, (2) solvent accessibility, and (3) pKa. The classifier was optimized to correctly predict 136 of the 161 cysteine thiols susceptible to oxidation. Leave-one-out cross-validation analysis showed that the percent of correctly classified cysteines was 80.1% and that 16.1% of the oxidation-susceptible cysteine thiols were incorrectly classified. The algorithm developed from these parameters, named the Cysteine Oxidation Prediction Algorithm (COPA), is presented here. COPA prediction of oxidation-susceptible sites can be utilized to locate protein cysteines susceptible to redox-mediated regulation and identify possible enzyme catalytic sites with reactive cysteine thiols.
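The performance figures quoted above can be turned into approximate confusion-matrix counts. The following is a minimal sketch of that arithmetic, not part of COPA itself. It assumes that the reported percentages were rounded from whole-number counts and that the 80.1% figure refers to all 322 cysteines in BALOSCTdb; the function and key names are hypothetical.

```python
# Figures taken from the abstract.
N_SUSCEPTIBLE = 161             # oxidation-susceptible cysteine thiols in BALOSCTdb
N_NON_SUSCEPTIBLE = 161         # non-susceptible cysteine thiols in BALOSCTdb
TRAIN_TRUE_POSITIVES = 136      # susceptible thiols correctly predicted after optimization
LOO_ACCURACY = 0.801            # fraction correctly classified under leave-one-out
LOO_MISSED_SUSCEPTIBLE = 0.161  # fraction of susceptible thiols misclassified under leave-one-out


def summarize_reported_performance():
    """Convert the reported percentages into approximate counts.

    Assumption (not stated in the abstract): each percentage is a rounded
    ratio of whole counts, so rounding recovers the underlying count.
    """
    total = N_SUSCEPTIBLE + N_NON_SUSCEPTIBLE

    # Sensitivity of the optimized classifier on the full data set.
    train_sensitivity = TRAIN_TRUE_POSITIVES / N_SUSCEPTIBLE

    # Leave-one-out counts recovered from the reported percentages.
    loo_correct = round(LOO_ACCURACY * total)
    loo_false_negatives = round(LOO_MISSED_SUSCEPTIBLE * N_SUSCEPTIBLE)
    loo_true_positives = N_SUSCEPTIBLE - loo_false_negatives
    loo_true_negatives = loo_correct - loo_true_positives
    loo_false_positives = N_NON_SUSCEPTIBLE - loo_true_negatives

    return {
        "total": total,
        "train_sensitivity": train_sensitivity,
        "loo_correct": loo_correct,
        "loo_false_negatives": loo_false_negatives,
        "loo_true_positives": loo_true_positives,
        "loo_true_negatives": loo_true_negatives,
        "loo_false_positives": loo_false_positives,
        "loo_sensitivity": loo_true_positives / N_SUSCEPTIBLE,
        "loo_specificity": loo_true_negatives / N_NON_SUSCEPTIBLE,
    }


if __name__ == "__main__":
    for name, value in summarize_reported_performance().items():
        print(f"{name}: {value}")
```

Under these assumptions, the leave-one-out figures correspond to roughly 258 of 322 cysteines classified correctly, with about 26 susceptible thiols missed and about 38 non-susceptible thiols flagged as susceptible. These derived counts are estimates from rounded percentages, not values reported by the authors.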