Sequence-Based Prediction with Feature Representation Learning and Biological Function Analysis of Channel Proteins.

Affiliations

Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, 610054 Chengdu, Sichuan, China.

School of Applied Chemistry and Biological Technology, Shenzhen Polytechnic, 518055 Shenzhen, Guangdong, China.

Publication information

Front Biosci (Landmark Ed). 2022 Jun 2;27(6):177. doi: 10.31083/j.fbl2706177.

Abstract

BACKGROUND

Channel proteins transport molecules across the plasma membrane by free diffusion. Because identifying them experimentally is costly in both labor and methods, a computational tool for identifying channel proteins is needed to support biological research on them.

METHODS

Seventeen feature encoding methods were combined with four machine learning classifiers to generate 68-dimensional probability features. A two-step feature selection strategy was then used to optimize these features, and the final prediction model, M16-LGBM (light gradient boosting machine), was obtained on the 16-dimensional optimal feature vector.
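
The arithmetic behind the 68 dimensions can be illustrated with a minimal sketch, assuming each of the 17 encodings is paired with each of the 4 classifiers and every pair contributes one predicted probability (17 x 4 = 68). The abstract names neither the encodings nor the classifiers, so everything below is hypothetical: amino acid composition (`aac`) stands in for all 17 encodings, and `make_stub_classifier` is a placeholder for a trained model, not the authors' method.

```python
import math
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(seq):
    """Amino acid composition: fraction of each standard residue in seq."""
    seq = seq.upper()
    n = max(len(seq), 1)  # avoid division by zero on an empty sequence
    return [seq.count(a) / n for a in AMINO_ACIDS]


def make_stub_classifier(bias):
    """Hypothetical stand-in for a trained classifier.

    Maps a feature vector to a pseudo-probability in (0, 1) with a
    logistic function; a real pipeline would use a fitted model here.
    """
    def predict_proba(features):
        score = sum(features) / len(features) + bias
        return 1.0 / (1.0 + math.exp(-score))
    return predict_proba


def probability_features(seq, encoders, classifiers):
    """Return one probability per (encoder, classifier) pair, encoder-major."""
    return [clf(enc(seq)) for enc, clf in product(encoders, classifiers)]


# Assumption: placeholders only, since the abstract does not list
# the 17 encodings or the 4 classifiers.
encoders = [aac] * 17
classifiers = [make_stub_classifier(b) for b in (-1.0, -0.5, 0.5, 1.0)]

vector = probability_features("MKTLLVAGAVLL", encoders, classifiers)
print(len(vector))  # 68
```

In the pipeline the abstract describes, a vector of this shape would then be reduced to 16 columns by the two-step feature selection before the LightGBM model is trained; that step is not sketched here because the abstract does not specify it.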

RESULTS

A new predictor, CAPs-LGBM, was proposed to identify channel proteins effectively.

CONCLUSIONS

CAPs-LGBM is the first machine learning predictor of channel proteins whose final prediction model is built from protein primary sequences. The classifier performed well on both the training and test sets.
