Information Engineering and Computer Science Department, via Sommarive 14 - I38100 (Povo) Trento, Italy.
BMC Bioinformatics. 2010 Mar 3;11:115. doi: 10.1186/1471-2105-11-115.
Prediction of catalytic residues is a major step in characterizing the function of enzymes. In its simplest formulation, the problem can be cast as a binary classification task at the residue level: predicting whether each residue is directly involved in the catalytic process. The task remains hard even when structural information is available, owing to the wide range of roles a functional residue can play and to the large imbalance between the numbers of catalytic and non-catalytic residues.
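To illustrate why the class imbalance matters, the following minimal sketch evaluates a trivial predictor that labels every residue as non-catalytic. The 10-in-1000 ratio and the function name `evaluate` are assumptions chosen for illustration only; they are not figures or code from the study.

```python
def evaluate(y_true, y_pred):
    """Return accuracy, precision and recall for binary labels (1 = catalytic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when nothing is predicted positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical protein set: 10 catalytic residues among 1000 (illustrative ratio).
y_true = [1] * 10 + [0] * 990
# Trivial baseline: predict "non-catalytic" for every residue.
y_pred = [0] * 1000

scores = evaluate(y_true, y_pred)
print(scores)
```

Under these assumed numbers the baseline reaches high accuracy while recovering no catalytic residue at all, which is why precision and recall are more informative than accuracy for this kind of task.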
We developed an effective representation of structural information by modeling spherical regions around candidate residues and extracting statistics on the properties of their content, such as physico-chemical properties, atomic density, flexibility, and the presence of water molecules. We trained an SVM classifier combining our features with sequence-based information and previously developed 3D features, and compared its performance with the most recent state-of-the-art approaches on different benchmark datasets. We further analyzed the discriminant power of the information provided by the presence of heterogens in the residue neighborhood.
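The spherical-neighborhood idea can be sketched roughly as follows. This is not the authors' implementation: the function name `neighborhood_features`, the atom record layout, the 8 Å default radius, the use of B-factors as a flexibility proxy, and the particular statistics are all assumptions made for illustration.

```python
import math


def neighborhood_features(center, atoms, radius=8.0):
    """Summarize the atoms inside a sphere around a candidate residue.

    center: (x, y, z) of the candidate residue (e.g. its centroid).
    atoms:  list of dicts with keys 'xyz', 'hydrophobicity',
            'b_factor' and 'is_water' (hypothetical record layout).
    radius: sphere radius in angstroms (assumed value).
    """
    inside = [a for a in atoms if math.dist(center, a["xyz"]) <= radius]
    protein = [a for a in inside if not a["is_water"]]
    waters = [a for a in inside if a["is_water"]]
    volume = 4.0 / 3.0 * math.pi * radius ** 3
    n = len(protein)
    return {
        "atom_count": n,
        # Atoms per cubic angstrom within the sphere.
        "atomic_density": n / volume,
        # Mean of a physico-chemical property over neighboring protein atoms.
        "mean_hydrophobicity": sum(a["hydrophobicity"] for a in protein) / n if n else 0.0,
        # B-factor used here as a simple stand-in for flexibility.
        "mean_b_factor": sum(a["b_factor"] for a in protein) / n if n else 0.0,
        "water_count": len(waters),
    }


# Toy input: two protein atoms and one water inside the sphere, one atom outside.
atoms = [
    {"xyz": (0.0, 0.0, 0.0), "hydrophobicity": 1.0, "b_factor": 10.0, "is_water": False},
    {"xyz": (1.0, 0.0, 0.0), "hydrophobicity": 3.0, "b_factor": 20.0, "is_water": False},
    {"xyz": (2.0, 0.0, 0.0), "hydrophobicity": 0.0, "b_factor": 30.0, "is_water": True},
    {"xyz": (10.0, 0.0, 0.0), "hydrophobicity": 5.0, "b_factor": 40.0, "is_water": False},
]
features = neighborhood_features((0.0, 0.0, 0.0), atoms, radius=5.0)
print(features)
```

A feature dictionary of this kind, computed for each candidate residue, could then be concatenated with sequence-based features and passed to a classifier such as an SVM.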
Our structure-based method achieves consistent improvements on all tested datasets over both sequence-based and structure-based state-of-the-art approaches. Structural neighborhood information is shown to be responsible for such results, and predicting the presence of nearby heterogens seems to be a promising direction for further improvements.