Mintz-Oron Shira, Aharoni Asaph, Ruppin Eytan, Shlomi Tomer
Department of Plant Sciences, Weizmann Institute of Science, Rehovot, Israel.
Bioinformatics. 2009 Jun 15;25(12):i247-52. doi: 10.1093/bioinformatics/btp209.
Revealing the subcellular localization of proteins within membrane-bound compartments is of major importance for inferring protein function. Though current high-throughput localization experiments provide valuable data, they are costly and time-consuming, and due to technical difficulties they are not readily applicable to many eukaryotes. Physical characteristics of proteins, such as sequence targeting signals and amino acid composition, are commonly used to predict subcellular localization computationally. Recently it was shown that protein-protein interaction (PPI) networks can be used to significantly improve the prediction accuracy of protein subcellular localization. However, because high-throughput PPI data depend on costly experiments and are currently available for only a few organisms, the scope of such methods is still limited.
This study presents a novel constraint-based method for predicting the subcellular localization of enzymes from the metabolic network in which they are embedded, relying on a parsimony principle: the number of cross-membrane metabolite transporters should be minimal. In a cross-validation test of predicting known subcellular localizations of yeast enzymes, the method is shown to be markedly robust, providing accurate localization predictions even when only 20% of the known enzyme localizations are given as input. It is shown to outperform pathway enrichment-based methods both in prediction accuracy and in its ability to predict the subcellular localization of entire metabolic pathways when no a priori pathway-specific localization data are available (and hence enrichment methods are bound to fail). With the number of available metabolic networks already exceeding 600 and growing fast, the new method may contribute significantly to the identification of enzyme localizations in many different organisms.
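The parsimony principle can be illustrated with a small sketch. This is not the authors' formulation, whose details the abstract does not give; it is a brute-force toy under stated assumptions. The reaction names, metabolite names, the two compartments, and the cost rule (a metabolite used in k compartments is assumed to need k - 1 transporters) are all hypothetical and chosen only for illustration.

```python
from itertools import product


def count_transporters(reactions, assignment):
    """Toy cost: a metabolite used in k compartments is assumed to need k - 1 transporters."""
    compartments_of = {}
    for rxn, metabolites in reactions.items():
        for met in metabolites:
            compartments_of.setdefault(met, set()).add(assignment[rxn])
    return sum(len(comps) - 1 for comps in compartments_of.values())


def predict_localization(reactions, known, compartments):
    """Try every assignment of the unlabeled reactions and keep the cheapest one."""
    unknown = sorted(r for r in reactions if r not in known)
    best_cost, best_assignment = None, None
    for choice in product(compartments, repeat=len(unknown)):
        assignment = dict(known)
        assignment.update(zip(unknown, choice))
        cost = count_transporters(reactions, assignment)
        if best_cost is None or cost < best_cost:
            best_cost, best_assignment = cost, assignment
    return best_assignment, best_cost


# Hypothetical toy network: each reaction maps to the metabolites it touches.
reactions = {
    "R1": {"A", "B"},
    "R2": {"A", "C"},
    "R3": {"E", "F"},
    "R4": {"B", "E"},
    "R5": {"C", "D"},
}
# Hypothetical known localizations used as input constraints.
known = {"R1": "cytosol", "R4": "mitochondrion"}
compartments = ("cytosol", "mitochondrion")

assignment, cost = predict_localization(reactions, known, compartments)
for rxn in sorted(assignment):
    print(rxn, assignment[rxn])
print("transporters needed:", cost)
```

In this toy network each unlabeled reaction is placed in the compartment of the labeled reaction with which it shares a metabolite, directly or through a chain of reactions. Exhaustive enumeration grows exponentially with the number of unlabeled reactions, so a genome-scale network would need a proper optimization formulation instead.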