The Institute of Pharmacology, Key Laboratory of Preclinical Study for New Drugs of Gansu Province, School of Basic Medical Sciences, Lanzhou University, Lanzhou, Gansu 730000, China.
School of Stomatology, Lanzhou University, Lanzhou, Gansu 730000, China.
J Chem Inf Model. 2022 May 23;62(10):2617-2629. doi: 10.1021/acs.jcim.2c00089. Epub 2022 May 9.
Although peptides are regarded as ideal therapeutic agents, only a small proportion of marketed drugs are peptides. In the past decade, pharmacists have paid great attention to the development of peptide therapeutics. Except for a few approved chemically or rationally designed peptides, most attempts have failed because of unsatisfactory efficacy or safety. Fortunately, computational methods such as artificial intelligence have been used to accelerate the discovery of therapeutic peptides by predicting the activity, toxicity, and absorption, distribution, metabolism, and excretion of polypeptides. Usually, a specific biological activity of a peptide can be accurately determined by an interest-oriented binary classifier built from a positive set and a negative set that has not been experimentally validated, regardless of the peptide's other characteristics. This makes it challenging to evaluate a research object comprehensively in the early stage of drug research and development. Herein, we propose an integrated method (GM-Pep) that combines a conditional variational autoencoder (CVAE) and a multiclassifier trained on positive samples (Deep-Multiclassifier) to effectively generate a single bioactive peptide sequence without toxicity or referential side effects. The results showed that our Deep-Multiclassifier model achieved a sequence accuracy of up to 96.41% [toxicity (94.48%), antifungal (96.58%), antihypertensive (97.18%), and antibacterial (96.91%)]. The properties of Deep-Multiclassifier and CVAE were validated with 12 antibacterial peptides synthesized for the first time or by comparison with random peptides. The source code and data sets are available at https://github.com/TimothyChen225/GM-Pep.
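To make the idea of a conditional generative model concrete, the following minimal sketch shows one common way a peptide sequence and a desired activity label could be encoded together as the input of a CVAE. It is an illustration under stated assumptions, not the GM-Pep implementation: the function names, the one-hot encoding, the fixed maximum length, and the zero padding are all hypothetical choices; only the four activity classes are taken from the abstract.

```python
# Hypothetical illustration; the encoding scheme and all names are assumptions,
# not taken from the GM-Pep source code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# The four classes named in the abstract.
ACTIVITIES = ["toxicity", "antifungal", "antihypertensive", "antibacterial"]


def one_hot_sequence(seq, max_len):
    """Return a max_len x 20 one-hot matrix; padding positions stay all-zero."""
    if len(seq) > max_len:
        raise ValueError("sequence longer than max_len")
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(max_len)]
    for pos, residue in enumerate(seq):
        # str.index raises ValueError for a non-standard residue.
        matrix[pos][AMINO_ACIDS.index(residue)] = 1
    return matrix


def condition_vector(activity):
    """One-hot vector marking the activity the generator is conditioned on."""
    if activity not in ACTIVITIES:
        raise ValueError(f"unknown activity: {activity}")
    return [1 if name == activity else 0 for name in ACTIVITIES]


def cvae_input(seq, activity, max_len=30):
    """Flatten the sequence encoding and append the condition vector."""
    flat = [bit for row in one_hot_sequence(seq, max_len) for bit in row]
    return flat + condition_vector(activity)


# Example with an arbitrary 7-residue sequence chosen for illustration.
x = cvae_input("KLAKLAK", "antibacterial", max_len=10)
print(len(x))  # 10 positions * 20 residues + 4 condition bits
```

In a conditional autoencoder of this kind, the same condition vector would typically also be appended to the latent code before decoding, so that sampling with a chosen label steers generation toward that class.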