Nielsen H, Brunak S, von Heijne G
Center for Biological Sequence Analysis, Department of Biotechnology, The Technical University of Denmark, Lyngby.
Protein Eng. 1999 Jan;12(1):3-9. doi: 10.1093/protein/12.1.3.
Prediction of protein sorting signals from amino acid sequence is of great importance in proteomics today. Recently, the growth of protein databases, combined with machine learning approaches such as neural networks and hidden Markov models, has made it possible to achieve a level of reliability at which practical use in, for example, automatic database annotation is feasible. In this review, we concentrate on the present status and future perspectives of SignalP, our neural network-based method for predicting the best-known sorting signal: the secretory signal peptide. We discuss the problems associated with the use of SignalP on genomic sequences, showing that signal peptide prediction will improve further if integrated with predictions of start codons and transmembrane helices. As a step towards this goal, a hidden Markov model version of SignalP has been developed, making it possible to discriminate between cleaved signal peptides and uncleaved signal anchors. Furthermore, we show how SignalP can be used to characterize putative signal peptides from an archaeon, Methanococcus jannaschii. Finally, we briefly review a few methods for predicting other protein sorting signals and discuss the future of protein sorting prediction in general.