Machine-Learning-Based Prediction of Cell-Penetrating Peptides and Their Uptake Efficiency with Improved Accuracy.

Affiliations

Department of Physiology, Ajou University School of Medicine, Suwon 443380, Republic of Korea.

Research and Development Center, Insilicogen Inc., Yongin-si, Suwon 441813, Republic of Korea.

Publication information

J Proteome Res. 2018 Aug 3;17(8):2715-2726. doi: 10.1021/acs.jproteome.8b00148. Epub 2018 Jul 2.

Abstract

Cell-penetrating peptides (CPPs) can enter cells as a variety of biologically active conjugates and have various biomedical applications. To offset the cost and effort of designing novel CPPs in the laboratory, computational methods are needed to identify candidate CPPs before in vitro experimental studies. We developed a two-layer prediction framework called machine-learning-based prediction of cell-penetrating peptides (MLCPP). The first layer predicts whether a given peptide is a CPP or a non-CPP, whereas the second layer predicts the uptake efficiency of the predicted CPPs. To construct the framework, we employed four different machine-learning methods and five different feature encodings: amino acid composition (AAC), dipeptide composition, amino acid index, composition-transition-distribution, and physicochemical properties (PCP). In the first layer, hybrid features (a combination of AAC and PCP) with an extremely randomized tree classifier outperformed state-of-the-art predictors in CPP prediction, with an accuracy of 0.896 when tested on independent data sets. In the second layer, hybrid features obtained through a feature selection protocol, combined with random forest, produced an accuracy of 0.725, which is also better than state-of-the-art predictors. We anticipate that MLCPP will become a valuable tool for predicting CPPs and their uptake efficiency and might facilitate hypothesis-driven experimental design. The MLCPP server interface, along with the benchmarking and independent data sets, is freely accessible at www.thegleelab.org/MLCPP.
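
Two of the feature encodings named in the abstract, amino acid composition and dipeptide composition, have simple standard definitions: the fraction of each of the 20 residues, and the fraction of each of the 400 ordered residue pairs, in a peptide. The sketch below illustrates those definitions only. It is not the authors' code; the function names are hypothetical, and the handling of case and of non-standard residues is an assumption made for illustration.

```python
from collections import Counter
from itertools import product
from typing import List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(peptide: str) -> List[float]:
    """Return the fraction of each standard residue (20 values).

    Assumption: non-standard characters count toward the length but have
    no feature of their own, so the fractions then sum to less than 1.
    """
    seq = peptide.upper()
    if not seq:
        raise ValueError("peptide must not be empty")
    counts = Counter(seq)
    return [counts[aa] / len(seq) for aa in AMINO_ACIDS]


def dipeptide_composition(peptide: str) -> List[float]:
    """Return the fraction of each ordered residue pair (400 values)."""
    seq = peptide.upper()
    if len(seq) < 2:
        raise ValueError("peptide must have at least two residues")
    # Overlapping adjacent pairs: positions (0,1), (1,2), ...
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = Counter(pairs)
    return [counts[a + b] / len(pairs) for a, b in product(AMINO_ACIDS, repeat=2)]


if __name__ == "__main__":
    tat = "YGRKKRRQRRR"  # HIV-1 Tat (47-57), a well-known CPP
    aac = amino_acid_composition(tat)
    print(dict(zip(AMINO_ACIDS, (round(v, 3) for v in aac))))
```

Vectors like these could be concatenated with other descriptors to form hybrid features and passed to a tree-ensemble classifier. Which features are combined and how they are selected is described in the paper itself, not here.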
