Chen Ching-Tai, Yang Ei-Wen, Hsu Hung-Ju, Sun Yi-Kun, Hsu Wen-Lian, Yang An-Suei
Institute of Information Science, Academia Sinica, Taipei 115, Taiwan.
Bioinformatics. 2008 Dec 1;24(23):2691-7. doi: 10.1093/bioinformatics/btn538. Epub 2008 Oct 29.
Regulatory proteases modulate proteomic dynamics with a spectrum of specificities against substrate proteins. Predicting the substrate sites of these proteases in a proteome would facilitate understanding of their biological functions. High-throughput experiments can generate datasets suitable for machine learning to capture the complex relationships between substrate sequences and enzymatic specificities. However, the capability of predicting protease substrate sites by integrating machine learning algorithms with such experimental methodology has yet to be demonstrated.
Factor Xa, a key regulatory protease in the blood coagulation system, was used as a model system, for which effective substrate site predictors were developed and benchmarked. The predictors were derived from bootstrap aggregation (machine learning) algorithms trained with data obtained from multilevel substrate phage display experiments. This combination of experimental sampling and computational learning of substrate specificities can be generalized to other proteases whose active forms are available for in vitro experiments.
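As a rough illustration of what bootstrap aggregation means in this setting, the following minimal Python sketch trains several simple scorers on resampled peptide data and averages their votes. It is not the authors' implementation: the per-position log-odds base learner, the function names train_base and train_bagged, the four-residue window, and the toy peptides are all assumptions made for illustration, and the toy peptides are invented rather than taken from the phage display data.

```python
import math
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def train_base(samples):
    """Fit a per-position log-odds scorer on (peptide, label) pairs.

    Assumed base learner for illustration only; label 1 = cleaved, 0 = not.
    """
    length = len(samples[0][0])
    pos = [Counter() for _ in range(length)]
    neg = [Counter() for _ in range(length)]
    n_pos = n_neg = 0
    for peptide, label in samples:
        counts = pos if label else neg
        for i, aa in enumerate(peptide):
            counts[i][aa] += 1
        if label:
            n_pos += 1
        else:
            n_neg += 1

    def score(peptide):
        total = 0.0
        for i, aa in enumerate(peptide):
            # Add-one smoothing keeps unseen residues from giving log(0).
            p = (pos[i][aa] + 1) / (n_pos + len(AMINO_ACIDS))
            q = (neg[i][aa] + 1) / (n_neg + len(AMINO_ACIDS))
            total += math.log(p / q)
        return total

    return score


def train_bagged(samples, n_models=25, seed=0):
    """Bootstrap aggregation: one base model per resample, then vote."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        # Bootstrap sample: draw len(samples) items with replacement.
        boot = [rng.choice(samples) for _ in samples]
        models.append(train_base(boot))

    def predict(peptide):
        # Fraction of base models that score the peptide as a substrate.
        votes = sum(1 for model in models if model(peptide) > 0)
        return votes / len(models)

    return predict


# Invented toy peptides (hypothetical, not experimental data).
toy_data = [
    ("IEGR", 1), ("IDGR", 1), ("IEGR", 1), ("LEGR", 1), ("IDGR", 1),
    ("AAAA", 0), ("GSTS", 0), ("KLMN", 0), ("PQST", 0), ("VWYF", 0),
]

predict = train_bagged(toy_data)
print("IEGR vote fraction:", predict("IEGR"))
print("AAAA vote fraction:", predict("AAAA"))
```

Averaging votes over resampled training sets is what reduces the variance of an unstable base learner; a real predictor would replace the toy scorer and peptides with a stronger base model and measured substrate sequences.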