Department of Bioscience and Bioinformatics, Kyushu Institute of Technology, Iizuka, Fukuoka 820-8502, Japan.
Japan Society for the Promotion of Science, Chiyoda-ku, Tokyo 102-0083, Japan.
Bioinformatics. 2020 Jun 1;36(11):3350-3356. doi: 10.1093/bioinformatics/btaa160.
The failure of therapeutic peptides in clinical trials can often be attributed to toxicity profiles such as hemolytic activity, which hamper their further progress as drug candidates. Accurately predicting hemolytic peptides (HLPs) and their activity from given peptides is a challenging task in immunoinformatics and is essential for drug development and basic research. Although a few computational methods have been proposed for this purpose, none of them can identify HLPs and their activities simultaneously.
In this study, we propose a two-layer prediction framework, called HLPpred-Fuse, that accurately and automatically predicts both whether a peptide is hemolytic (HLP or non-HLP) and the activity of HLPs (high or low). More specifically, a feature representation learning scheme was used to generate 54 probabilistic features by combining six different machine learning classifiers with nine different sequence-based encodings (6 × 9 = 54). The 54 probabilistic features were then fused to provide sufficiently converged sequence information, which was used as input to extremely randomized trees to develop two final prediction models that independently identify HLPs and their activity. Performance comparisons based on empirical cross-validation analysis, an independent test and a case study demonstrate that HLPpred-Fuse consistently outperforms state-of-the-art methods in identifying hemolytic activity.
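The feature representation learning step described above can be illustrated with a scaled-down sketch. It is not the authors' implementation. It assumes scikit-learn, uses only two encodings (amino acid composition and dipeptide composition) and three base classifiers, and so produces 2 × 3 = 6 probabilistic features rather than the 54 of HLPpred-Fuse. The peptides and labels are synthetic, the helper names (`aac`, `dpc`, `make_toy_data`, `probabilistic_features`) are hypothetical, and the use of out-of-fold probabilities is an assumption about how such features would typically be generated.

```python
import itertools
import random

import numpy as np
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in itertools.product(AMINO_ACIDS, repeat=2)]


def aac(seq):
    """Amino acid composition: 20 residue frequencies."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dpc(seq):
    """Dipeptide composition: 400 frequencies of adjacent residue pairs."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return [pairs.count(d) / len(pairs) for d in DIPEPTIDES]


def make_toy_data(n=120, length=20, seed=0):
    """Synthetic peptides (NOT real data): positives are enriched in K/L/W."""
    rng = random.Random(seed)
    seqs, labels = [], []
    for i in range(n):
        label = i % 2
        alphabet = AMINO_ACIDS + ("KLW" * 5 if label else "")
        seqs.append("".join(rng.choice(alphabet) for _ in range(length)))
        labels.append(label)
    return seqs, np.array(labels)


def probabilistic_features(seqs, y, encoders, classifiers, cv=5):
    """One column per (encoding, classifier) pair: out-of-fold P(class 1)."""
    columns = []
    for encode in encoders:
        X = np.array([encode(s) for s in seqs])
        for clf in classifiers:
            proba = cross_val_predict(clf, X, y, cv=cv, method="predict_proba")
            columns.append(proba[:, 1])
    return np.column_stack(columns)


seqs, y = make_toy_data()
encoders = [aac, dpc]  # the paper uses nine encodings
classifiers = [        # the paper uses six classifiers
    RandomForestClassifier(n_estimators=50, random_state=0),
    LogisticRegression(max_iter=1000),
    KNeighborsClassifier(n_neighbors=5),
]

# Fuse the probabilistic features and train the final extremely randomized trees model.
fused = probabilistic_features(seqs, y, encoders, classifiers)
final_model = ExtraTreesClassifier(n_estimators=100, random_state=0).fit(fused, y)
print(fused.shape)  # (120, 6)
```

In a two-layer setup of the kind described, the same procedure would be repeated with high versus low activity labels on the HLP subset to obtain the second model.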
For the convenience of experimental scientists, a web-based tool has been established at http://thegleelab.org/HLPpred-Fuse.
glee@ajou.ac.kr or watshara.sho@mahidol.ac.th or bala@ajou.ac.kr.
Supplementary data are available at Bioinformatics online.