Chou K C, Tomasselli A G, Reardon I M, Heinrikson R L
Pharmacia & Upjohn Laboratories, Kalamazoo, Michigan 49001-4940, USA.
Proteins. 1996 Jan;24(1):51-72. doi: 10.1002/(SICI)1097-0134(199601)24:1<51::AID-PROT4>3.0.CO;2-R.
Based on the sequence-coupled (Markov chain) model and the vector-projection principle, a discriminant function method is proposed to predict sites in protein substrates that should be susceptible to cleavage by the HIV-1 protease. The discriminant function is defined by delta = phi+ - phi-, where phi+ and phi- are the cleavable and noncleavable attributes of a given peptide. Both attributes can be derived from two complementary sets of peptides, S+ and S-, known to be cleavable and noncleavable, respectively, by the enzyme. The rates of correct prediction for the 62 cleavable and 239 noncleavable peptides in the training set are 100% and 96.7%, respectively. Applied to 55 sequences outside the training set that are known to be cleaved by the HIV-1 protease, the method predicted all of them to be substrates of the enzyme. The method also predicted all but one of the sites hydrolyzed by the protease in native HIV-1 and HIV-2 reverse transcriptases, where the HIV-1 protease discriminates between nearly identical sequences in a very subtle fashion. Finally, the algorithm correctly predicts all of the HIV-1 protease processing sites in the native gag and gag/pol HIV-1 polyproteins, and all of the cleavage sites identified in denatured protease and reverse transcriptase. The new predictive algorithm provides a novel route toward understanding the specificity of this important therapeutic target.
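The general shape of a discriminant of the form delta = phi+ - phi- can be illustrated with a minimal sketch. The abstract does not give the formulas for phi+ and phi-, so the sketch substitutes a simpler stand-in: each attribute is the log-likelihood of the peptide under a position-specific first-order Markov chain fitted to S+ or S-. This is not the authors' vector-projection definition. The function names, the pseudocount, the decision rule delta > 0, and the peptide strings are all assumptions made for illustration; the peptides are arbitrary placeholders, not experimental data.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def fit_model(peptides, pseudocount=1.0):
    """Fit position-specific first-order Markov log-probabilities.

    Assumes all peptides have the same length and use the 20 standard
    one-letter amino acid codes. The pseudocount is an assumed smoothing
    choice so that unseen residue pairs do not get zero probability.
    """
    length = len(peptides[0])
    first = {a: pseudocount for a in AMINO_ACIDS}
    trans = [
        {a: {b: pseudocount for b in AMINO_ACIDS} for a in AMINO_ACIDS}
        for _ in range(length - 1)
    ]
    for pep in peptides:
        first[pep[0]] += 1
        for i in range(length - 1):
            trans[i][pep[i]][pep[i + 1]] += 1

    first_total = sum(first.values())
    log_first = {a: math.log(c / first_total) for a, c in first.items()}
    log_trans = []
    for table in trans:
        log_table = {}
        for a, row in table.items():
            row_total = sum(row.values())
            log_table[a] = {b: math.log(c / row_total) for b, c in row.items()}
        log_trans.append(log_table)
    return log_first, log_trans


def attribute(peptide, model):
    """Stand-in for phi: log-likelihood of the peptide under one model."""
    log_first, log_trans = model
    score = log_first[peptide[0]]
    for i, table in enumerate(log_trans):
        score += table[peptide[i]][peptide[i + 1]]
    return score


def delta(peptide, plus_model, minus_model):
    """Discriminant delta = phi+ - phi-, using the stand-in attributes."""
    return attribute(peptide, plus_model) - attribute(peptide, minus_model)


# Placeholder training sets (hypothetical strings, not real substrates).
S_PLUS = ["AAAAAAAA", "AAAALLLL"]
S_MINUS = ["GGGGGGGG", "GGGGKKKK"]

plus_model = fit_model(S_PLUS)
minus_model = fit_model(S_MINUS)

for pep in ("AAAAAAAA", "GGGGGGGG"):
    d = delta(pep, plus_model, minus_model)
    # Assumed decision rule: positive delta is read as "cleavable".
    print(pep, "cleavable" if d > 0 else "noncleavable")
```

In this sketch a peptide that resembles the S+ set more closely than the S- set receives a positive delta. Real use would require fitting both models to experimentally characterized peptides of a fixed window length.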