Better Performance with Transformer: CPPFormer in the Precise Prediction of Cell-penetrating Peptides.

Affiliations

Department of Computer Science, University of Tsukuba, Tsukuba, Japan.

School of Software, Shandong University, Jinan, China.

Publication information

Curr Med Chem. 2022;29(5):881-893. doi: 10.2174/0929867328666210920103140.

Abstract

Owing to its superior performance, the Transformer model, based on the 'Encoder-Decoder' paradigm, has become the mainstream model in natural language processing. Meanwhile, bioinformatics has embraced machine learning, which has led to remarkable progress in drug design and protein property prediction. Cell-penetrating peptides (CPPs) are a type of permeable protein that serves as a convenient 'postman' in drug penetration tasks. However, only a few CPPs have been discovered, limiting their practical applications in drug permeability. CPPs have led to a new approach that enables the uptake of only macromolecules into cells (i.e., without other potentially harmful materials found in the drug). Most previous studies have used basic machine learning techniques and hand-crafted features to construct a simple classifier. CPPFormer was constructed by implementing the attention structure of the Transformer, rebuilding the network to suit the short length of CPPs, and combining an automatic feature extractor with a few manually engineered features to jointly guide the predicted results. Compared with all previous methods and other classic text classification models, the empirical results show that our proposed deep model-based method achieves the best performance, with an accuracy of 92.16% on the CPP924 dataset, and passes various index tests.
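The abstract does not give implementation details, so the following is only a minimal sketch of the general idea it describes: a single scaled dot-product self-attention layer applied to a short peptide, with its pooled output concatenated with a few hand-crafted features. Every name here (`self_attention`, `handcrafted_features`, `featurize`), the one-hot encoding, the choice of three charge-based features, and the random untrained weights are illustrative assumptions, not the CPPFormer architecture itself. The classifier head and training are omitted.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(peptide):
    """Encode each residue as a 20-dimensional one-hot vector."""
    rows = []
    for aa in peptide:
        row = [0.0] * len(AMINO_ACIDS)
        row[AA_INDEX[aa]] = 1.0
        rows.append(row)
    return rows


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(xs):
    """Numerically stable softmax over one list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, wq, wk, wv):
    """Single-head scaled dot-product self-attention: softmax(QK^T / sqrt(d_k)) V."""
    q, k, v = matmul(x, wq), matmul(x, wk), matmul(x, wv)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


def handcrafted_features(peptide):
    """Hypothetical manual features: length, positive fraction, net charge per residue."""
    n = len(peptide)
    positive = sum(peptide.count(aa) for aa in "KR")
    negative = sum(peptide.count(aa) for aa in "DE")
    return [float(n), positive / n, (positive - negative) / n]


def random_matrix(rows, cols, rng):
    """Untrained placeholder weights; a real model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def featurize(peptide, d_model=8, seed=0):
    """Concatenate mean-pooled attention output with the hand-crafted features."""
    rng = random.Random(seed)
    x = one_hot(peptide)
    wq, wk, wv = (random_matrix(len(AMINO_ACIDS), d_model, rng) for _ in range(3))
    attended, weights = self_attention(x, wq, wk, wv)
    pooled = [sum(col) / len(attended) for col in zip(*attended)]
    return pooled + handcrafted_features(peptide), weights


# An arginine-rich example sequence, chosen for illustration (not from the source).
features, attention = featurize("GRKKRRQRRRPPQ")
print(len(features), len(attention), len(attention[0]))
```

Because CPPs are short, the full attention matrix stays small (one row and one column per residue), which is one plausible reason an attention-based encoder suits this task. The combined vector would then be passed to a classifier, which this sketch leaves out.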
