Qiu Wang-Ren, Xiao Xuan, Xu Zhao-Chun, Chou Kuo-Chen
Computer Department, Jingdezhen Ceramic Institute, Jingdezhen, China.
Department of Computer Science and Bond Life Science Center, University of Missouri, Columbia, MO, USA.
Oncotarget. 2016 Aug 9;7(32):51270-51283. doi: 10.18632/oncotarget.9987.
Protein phosphorylation is a posttranslational modification (PTM or PTLM) in which a phosphoryl group is added to one or more residues of a protein molecule. The most commonly phosphorylated amino acids are serine (S), threonine (T), and tyrosine (Y). Protein phosphorylation plays a significant role in a wide range of cellular processes, and its dysregulation is implicated in many diseases. Therefore, from the standpoint of both basic research and drug development, we face a challenging problem: for an uncharacterized protein sequence containing many S, T, or Y residues, which ones can be phosphorylated, and which ones cannot? To address this problem, we have developed a predictor called iPhos-PseEn by fusing four different pseudo component approaches (amino acids' disorder scores, nearest neighbor scores, occurrence frequencies, and position weights) into an ensemble classifier via a voting system. Rigorous cross-validations indicated that the proposed predictor remarkably outperformed its existing counterparts. For the convenience of most experimental scientists, a user-friendly web server for iPhos-PseEn has been established at http://www.jci-bioinfo.cn/iPhos-PseEn, by which users can easily obtain their desired results without needing to work through the complicated mathematical equations involved.
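The abstract describes fusing four feature-based predictors into an ensemble through voting. The sketch below illustrates that general idea only; it is not the iPhos-PseEn implementation. The function names (`candidate_sites`, `majority_vote`, `predict_sites`), the toy classifiers, and the rule that ties count as negative are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import Callable, List, Tuple

# Residues that can be phosphorylated according to the abstract.
PHOSPHO_RESIDUES = {"S", "T", "Y"}

# A classifier takes (sequence, 1-based position) and returns True
# if it votes "phosphorylated". Hypothetical interface.
Classifier = Callable[[str, int], bool]


def candidate_sites(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based position, residue) for every S, T, or Y."""
    return [
        (i + 1, aa)
        for i, aa in enumerate(sequence.upper())
        if aa in PHOSPHO_RESIDUES
    ]


def majority_vote(votes: List[bool]) -> bool:
    """True only if strictly more than half of the votes are positive.

    Treating a tie as negative is an assumption of this sketch.
    """
    return sum(votes) * 2 > len(votes)


def predict_sites(
    sequence: str, classifiers: List[Classifier]
) -> List[Tuple[int, str, bool]]:
    """Collect one vote per classifier for each candidate site."""
    results = []
    for pos, aa in candidate_sites(sequence):
        votes = [clf(sequence, pos) for clf in classifiers]
        results.append((pos, aa, majority_vote(votes)))
    return results


# Four placeholder classifiers standing in for the four feature views
# (disorder, nearest neighbor, frequency, position weight).
# They are toy rules, not real models.
toy_classifiers: List[Classifier] = [
    lambda seq, pos: seq[pos - 1] == "S",
    lambda seq, pos: pos % 2 == 0,
    lambda seq, pos: pos > 2,
    lambda seq, pos: True,
]

for pos, aa, is_site in predict_sites("MSTAY", toy_classifiers):
    print(pos, aa, "predicted site" if is_site else "predicted non-site")
```

In practice each placeholder would be replaced by a model trained on one of the four feature representations, and the tie rule would follow whatever the ensemble design prescribes.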