Martelli Pier Luigi, Savojardo Castrense, Fariselli Piero, Tasco Gianluca, Casadio Rita
Biocomputing Group, CIRI Health Sciences & Technologies (HST), University of Bologna, Bologna, Italy.
Methods Mol Biol. 2015;1264:305-20. doi: 10.1007/978-1-4939-2257-4_27.
Computational methods are invaluable when protein sequences derived directly from genomic data need functional and structural annotation. Subcellular localization, which is necessary for understanding a protein's role and identifies the compartment where the mature protein is active, is very difficult to characterize experimentally. Mitochondrial proteins synthesized on cytosolic ribosomes carry specific patterns in the precursor sequence, from which it is possible to recognize a peptide that targets the protein to its final destination. Here we discuss to what extent it is feasible to develop computational methods for detecting mitochondrial targeting peptides in precursor sequences, and we benchmark our method and others on the human mitochondrial proteins with experimentally characterized targeting peptides. Furthermore, we illustrate our newly implemented web server and its use on the whole human proteome to infer mitochondrial targeting peptides, their cleavage sites, and whether the targeting peptide regions contain recurrent arginine-rich motifs. In this way, we add about 2,800 human proteins to the 124 already experimentally annotated with a mitochondrial targeting peptide.