Computer Department, Jingdezhen Ceramic Institute, Jingdezhen 333403, China; Department of Computer Science and Bond Life Science Center, University of Missouri, Columbia, MO, USA; Computational Biology, Gordon Life Science Institute, Boston, MA 02478, USA.
Computer Department, Jingdezhen Ceramic Institute, Jingdezhen 333403, China.
Bioinformatics. 2016 Oct 15;32(20):3116-3123. doi: 10.1093/bioinformatics/btw380. Epub 2016 Jun 22.
Post-translational modification (PTM) refers to the change of the amino acid side chains of a protein after its biosynthesis. Owing to its significance for the in-depth understanding of various biological processes and for developing effective drugs, the prediction of PTM sites in proteins has become a hot topic in bioinformatics. Although many computational methods have been established to identify various single-label PTM types and their occurrence sites in proteins, no method has yet been developed for multi-label PTM types. As one of the most frequently observed PTMs, K-PTM, namely modification occurring at lysine (K), can take many different types, such as 'acetylation', 'crotonylation', 'methylation' and 'succinylation'. This raises an interesting challenge: given an uncharacterized protein sequence containing many K residues, which ones can accommodate two or more types of PTM, which ones only one, and which ones none?
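To make the problem setting concrete, the following is a minimal sketch of how candidate sites might be enumerated: every K in a sequence is paired with a fixed-length peptide window centered on it. The function name `lysine_windows`, the flank size of 13 and the padding character 'X' are illustrative assumptions; the abstract does not state how the authors segment sequences.

```python
from typing import List, Tuple


def lysine_windows(sequence: str, flank: int = 13, pad: str = "X") -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every lysine (K) in `sequence`.

    Each window holds `flank` residues on both sides of the K. Windows that
    run past either end of the sequence are filled with `pad` (an assumed
    convention, not taken from the abstract).
    """
    seq = sequence.upper()
    padded = pad * flank + seq + pad * flank
    windows = []
    for i, residue in enumerate(seq):
        if residue == "K":
            # Residue i sits at index i + flank in `padded`, so the window
            # covering i - flank .. i + flank starts at index i.
            windows.append((i + 1, padded[i:i + 2 * flank + 1]))
    return windows


if __name__ == "__main__":
    for position, window in lysine_windows("MKAK", flank=2):
        print(position, window)
```

Each window would then be assigned a set of labels drawn from the four PTM types; an empty set corresponds to a K that accommodates none of them.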
To address this problem, a multi-label predictor called iPTM-mLys has been developed. It represents the first multi-label PTM predictor ever established. The novel predictor features the incorporation of sequence-coupled effects into the general PseAAC (pseudo amino acid composition) and the fusion of an array of basic random forest classifiers into an ensemble system. Rigorous cross-validations using a set of multi-label metrics indicate that the predictor is very promising.
For the convenience of most experimental scientists, a user-friendly web server for iPTM-mLys has been established at http://www.jci-bioinfo.cn/iPTM-mLys, by which users can easily obtain their desired results without needing to work through the complicated mathematical equations involved.
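The abstract does not list which multi-label metrics were used. As an illustration only, the sketch below computes five commonly used example-based multi-label measures (precision, recall, Jaccard accuracy, exact match and Hamming loss) over sets of predicted and true PTM labels. The function name `multilabel_metrics` and the handling of empty label sets (a score of 1 when both sets are empty, 0 when only the denominator set is empty) are assumptions made for this sketch.

```python
from typing import Dict, Sequence, Set

LABELS = ("acetylation", "crotonylation", "methylation", "succinylation")


def multilabel_metrics(
    true_sets: Sequence[Set[str]],
    pred_sets: Sequence[Set[str]],
    n_labels: int = len(LABELS),
) -> Dict[str, float]:
    """Average example-based multi-label scores over all sites."""
    if not true_sets or len(true_sets) != len(pred_sets):
        raise ValueError("need two non-empty sequences of equal length")

    totals = {"precision": 0.0, "recall": 0.0, "accuracy": 0.0,
              "exact_match": 0.0, "hamming_loss": 0.0}
    for true, pred in zip(true_sets, pred_sets):
        inter = len(true & pred)
        union = len(true | pred)
        # Assumed convention: an empty denominator scores 1.0 only when
        # both label sets are empty (a correctly predicted unmodified K).
        both_empty = 1.0 if union == 0 else 0.0
        totals["precision"] += inter / len(pred) if pred else both_empty
        totals["recall"] += inter / len(true) if true else both_empty
        totals["accuracy"] += inter / union if union else 1.0
        totals["exact_match"] += 1.0 if true == pred else 0.0
        # Fraction of the n_labels label slots that were predicted wrongly.
        totals["hamming_loss"] += (union - inter) / n_labels

    n = len(true_sets)
    return {name: value / n for name, value in totals.items()}


if __name__ == "__main__":
    true = [{"acetylation", "methylation"}, set(), {"succinylation"}, {"crotonylation"}]
    pred = [{"acetylation"}, set(), {"succinylation", "methylation"}, {"crotonylation"}]
    print(multilabel_metrics(true, pred))
```

Exact match is the strictest of these measures, since a site counts as correct only when its whole predicted label set equals the true set.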
wqiu@gordonlifescience.org, xxiao@gordonlifescience.org, kcchou@gordonlifescience.org
Supplementary information: Supplementary data are available at Bioinformatics online.