Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan.
Bioinformatics. 2017 Jul 1;33(13):1937-1943. doi: 10.1093/bioinformatics/btx112.
Functional prediction of paralogs is challenging in bioinformatics because of rapid functional diversification after gene duplication events combined with parallel acquisitions of similar functions by different paralogs. Plant type III polyketide synthases (PKSs), producing various secondary metabolites, represent a paralogous family that has undergone gene duplication and functional alteration. Currently, there is no computational method available for the functional prediction of type III PKSs.
We developed a plant type III PKS reaction predictor, pPAP, based on the recently proposed classification of type III PKSs. pPAP combines two kinds of similarity measures: one calculated by profile hidden Markov models (pHMMs) built from functionally and structurally important partial sequence regions, and the other based on mutual information between residue positions. pPAP targets PKSs acting on ring-type starter substrates and classifies their functions into four reaction types. The pHMM approach alone discriminated two reaction types with high accuracy (97.5%, 39/40), but its accuracy decreased when discriminating three reaction types (87.8%, 43/49). When it was combined with a correlation-based approach, all 49 PKSs were correctly discriminated, and pPAP remained highly accurate (91.4%, 64/70) even after other reaction types were added. These results suggest that pPAP, which is based on linear discriminant analyses of similarity measures, is effective for plant type III PKS function prediction.
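To make the second similarity measure concrete, the following is a minimal sketch of Shannon mutual information between two columns of a multiple sequence alignment. It is an illustration only, not pPAP's implementation: the abstract does not give the exact formula, gap handling, or weighting used, and the function name `column_mutual_information` and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def column_mutual_information(alignment, i, j):
    """Shannon mutual information (in bits) between alignment columns i and j.

    Assumption: `alignment` is a list of equal-length strings, and every
    character (including any gap symbol) is treated as an ordinary state.
    """
    pairs = [(seq[i], seq[j]) for seq in alignment]
    n = len(pairs)
    count_i = Counter(a for a, _ in pairs)   # marginal counts for column i
    count_j = Counter(b for _, b in pairs)   # marginal counts for column j
    count_ij = Counter(pairs)                # joint counts for (i, j)

    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * log2(p_ab * n * n / (count_i[a] * count_j[b]))
    return mi


# Hypothetical toy alignment: columns 0 and 1 always change together,
# while column 2 varies independently of column 0.
toy_alignment = ["ACDG", "ACEG", "TGDG", "TGEG"]

coupled = column_mutual_information(toy_alignment, 0, 1)
independent = column_mutual_information(toy_alignment, 0, 2)
```

In this toy case the co-varying column pair scores higher than the independent pair, which is the kind of signal a correlation-based measure between residue positions is meant to capture.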
pPAP is freely available at ftp://ftp.genome.jp/pub/tools/ppap/.
Supplementary data are available at Bioinformatics online.