iPTM-mLys: identifying multiple lysine PTM sites and their different types.

Author information

Computer Department, Jingdezhen Ceramic Institute, Jingdezhen 333403, China; Department of Computer Science and Bond Life Science Center, University of Missouri, Columbia, MO, USA; Computational Biology, Gordon Life Science Institute, Boston, MA 02478, USA.

Computer Department, Jingdezhen Ceramic Institute, Jingdezhen 333403, China.

Publication information

Bioinformatics. 2016 Oct 15;32(20):3116-3123. doi: 10.1093/bioinformatics/btw380. Epub 2016 Jun 22.

DOI:10.1093/bioinformatics/btw380

PMID:27334473

Abstract

MOTIVATION

Post-translational modification (PTM) refers to the change of the amino acid side chains of a protein after its biosynthesis. Owing to its significance for an in-depth understanding of various biological processes and for developing effective drugs, prediction of PTM sites in proteins has become a hot topic in bioinformatics. Although many computational methods have been established to identify various single-label PTM types and their occurrence sites in proteins, no method has been developed for multi-label PTM types. As one of the most frequently observed PTMs, K-PTM, namely modification occurring at lysine (K), can accommodate many different types, such as 'acetylation', 'crotonylation', 'methylation' and 'succinylation'. This raises an interesting challenge: given an uncharacterized protein sequence containing many K residues, which ones can accommodate two or more types of PTM, which ones only one, and which ones none?
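To make the question concrete, the following minimal Python sketch shows one way the problem could be framed: every K residue becomes a candidate site with a surrounding sequence window, and each site carries a set of zero or more PTM labels. The window half-width, the padding character 'X', the function names and the toy annotations are all illustrative assumptions, not details taken from the paper.

```python
from typing import Dict, List, Set, Tuple

# The four K-PTM types named in the abstract.
PTM_TYPES = ("acetylation", "crotonylation", "methylation", "succinylation")


def lysine_windows(sequence: str, half_width: int = 3,
                   pad: str = "X") -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every K in the sequence.

    half_width and pad are hypothetical choices for illustration only.
    """
    padded = pad * half_width + sequence + pad * half_width
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "K":
            # Index i in the sequence is index i + half_width in padded,
            # so this slice is centered on the K residue.
            windows.append((i + 1, padded[i:i + 2 * half_width + 1]))
    return windows


def group_by_label_count(annotations: Dict[int, Set[str]],
                         positions: List[int]) -> Dict[str, List[int]]:
    """Split K positions into sites with no, one, or several PTM types."""
    groups: Dict[str, List[int]] = {"none": [], "single": [], "multiple": []}
    for pos in positions:
        n = len(annotations.get(pos, set()))
        key = "none" if n == 0 else "single" if n == 1 else "multiple"
        groups[key].append(pos)
    return groups


if __name__ == "__main__":
    seq = "MKAAKTK"  # toy sequence, not real data
    sites = lysine_windows(seq, half_width=2)
    toy_labels = {2: {"acetylation", "methylation"}, 5: {"succinylation"}}
    print(sites)
    print(group_by_label_count(toy_labels, [pos for pos, _ in sites]))
```

In this framing a multi-label predictor would take each window as input and output a subset of PTM_TYPES, with the empty set meaning an unmodified site.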

RESULTS

To address this problem, a multi-label predictor called iPTM-mLys has been developed. It represents the first multi-label PTM predictor ever established. The predictor is characterized by incorporating sequence-coupled effects into the general PseAAC (pseudo amino acid composition) and by fusing an array of basic random forest classifiers into an ensemble system. Rigorous cross-validation with a set of multi-label metrics indicates that this first multi-label PTM predictor is very promising.
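The abstract does not list the multi-label metrics used. As a hedged illustration, the sketch below computes five example-based metrics that are common in multi-label learning (aiming, coverage, accuracy, absolute-true and absolute-false); that the paper uses exactly these is an assumption. The function name and the convention for sites where both the true and predicted label sets are empty (counted as a perfect match) are also assumptions made for this sketch.

```python
from typing import Dict, List, Set


def multilabel_metrics(true_sets: List[Set[str]],
                       pred_sets: List[Set[str]],
                       n_label_types: int) -> Dict[str, float]:
    """Average example-based multi-label metrics over all samples.

    Assumed convention: a ratio whose denominator is an empty set is
    scored 1.0 if both sets are empty, and 0.0 otherwise.
    """
    if len(true_sets) != len(pred_sets) or not true_sets:
        raise ValueError("need two non-empty lists of equal length")

    def ratio(numerator: int, denominator: int, both_empty: bool) -> float:
        if denominator == 0:
            return 1.0 if both_empty else 0.0
        return numerator / denominator

    totals = {"aiming": 0.0, "coverage": 0.0, "accuracy": 0.0,
              "absolute_true": 0.0, "absolute_false": 0.0}
    for true, pred in zip(true_sets, pred_sets):
        inter = len(true & pred)
        union = len(true | pred)
        both_empty = union == 0
        totals["aiming"] += ratio(inter, len(pred), both_empty)    # precision-like
        totals["coverage"] += ratio(inter, len(true), both_empty)  # recall-like
        totals["accuracy"] += ratio(inter, union, both_empty)      # Jaccard overlap
        totals["absolute_true"] += 1.0 if true == pred else 0.0    # exact match
        # Fraction of label types that were wrongly added or missed.
        totals["absolute_false"] += (union - inter) / n_label_types
    n = len(true_sets)
    return {name: value / n for name, value in totals.items()}


if __name__ == "__main__":
    # Toy labels for three K sites, not real data.
    truth = [{"acetylation"}, {"acetylation", "methylation"}, set()]
    guess = [{"acetylation"}, {"acetylation"}, set()]
    for name, value in multilabel_metrics(truth, guess, 4).items():
        print(f"{name}: {value:.3f}")
```

For the first four metrics higher is better; for absolute-false lower is better, since it measures the average fraction of label types assigned incorrectly per site.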

AVAILABILITY AND IMPLEMENTATION

For the convenience of most experimental scientists, a user-friendly web-server for iPTM-mLys has been established at http://www.jci-bioinfo.cn/iPTM-mLys, by which users can easily obtain their desired results without the need to go through the complicated mathematical equations involved.

CONTACT

wqiu@gordonlifescience.org, xxiao@gordonlifescience.org, kcchou@gordonlifescience.org

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
