Institute for Molecular Bioscience, The University of Queensland, Brisbane, Australia.
Bioinformatics. 2011 May 1;27(9):1239-46. doi: 10.1093/bioinformatics/btr121. Epub 2011 Mar 3.
Nucleo-cytoplasmic trafficking of proteins is a core regulatory process that sustains the integrity of the nuclear space of eukaryotic cells via an interplay between numerous factors. Despite progress on experimentally characterizing a number of nuclear localization signals, their presence alone remains an unreliable indicator of actual translocation.
This article introduces a probabilistic model that explicitly recognizes a variety of nuclear localization signals, and integrates relevant amino acid sequence and interaction data for any candidate nuclear protein. In particular, we develop and incorporate scoring functions based on distinct classes of classical nuclear localization signals. Our empirical results show that the model accurately predicts whether a protein is imported into the nucleus, surpassing the classification accuracy of similar predictors when evaluated on the mouse and yeast proteomes (area under the receiver operating characteristic curve of 0.84 and 0.80, respectively). The model also predicts the sequence position of a nuclear localization signal and whether it interacts with importin-α.
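As a rough illustration of what locating a classical nuclear localization signal in a sequence can look like, the sketch below scans a protein for two commonly cited consensus patterns: a monopartite motif K(K/R)X(K/R) and a bipartite motif of two basic residues, a 10 to 12 residue linker, and at least three basic residues among the next five. This is not the probabilistic scoring described in the abstract; the patterns, the function name `find_classical_nls`, and the hit format are assumptions chosen for the example.

```python
import re

# Assumed consensus for a monopartite classical NLS: K-(K/R)-X-(K/R).
# The lookahead lets overlapping matches be reported.
MONOPARTITE = re.compile(r"(?=(K[KR].[KR]))")

BASIC = set("KR")


def find_classical_nls(sequence):
    """Return (class, 1-based start, matched segment) for each candidate motif.

    Hypothetical helper for illustration only; it applies simple pattern
    matching, not the scoring functions of the published model.
    """
    seq = sequence.upper()
    hits = []

    # Monopartite candidates.
    for match in MONOPARTITE.finditer(seq):
        hits.append(("monopartite", match.start() + 1, match.group(1)))

    # Bipartite candidates: two basic residues, a linker of 10-12 residues,
    # then a 5-residue window containing at least 3 basic residues.
    for start in range(len(seq) - 1):
        if seq[start] in BASIC and seq[start + 1] in BASIC:
            for linker in (10, 11, 12):
                tail_start = start + 2 + linker
                tail = seq[tail_start:tail_start + 5]
                if len(tail) == 5 and sum(c in BASIC for c in tail) >= 3:
                    hits.append(("bipartite", start + 1, seq[start:tail_start + 5]))
                    break  # report the shortest qualifying linker only

    return hits


if __name__ == "__main__":
    # Short test sequence embedding the SV40 large T antigen NLS (PKKKRKV).
    print(find_classical_nls("MAPKKKRKVED"))
    # [('monopartite', 4, 'KKKR'), ('monopartite', 5, 'KKRK')]
```

A pattern match of this kind only flags candidate positions. As the abstract notes, the presence of a signal alone is an unreliable indicator of translocation, which is why the model combines such evidence with other sequence and interaction data.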