Computational identification of ubiquitination sites in Arabidopsis thaliana using convolutional neural networks.

Authors

Wang Xiaofeng, Yan Renxiang, Chen Yong-Zi, Wang Yongji

Affiliations

College of Mathematics and Computer Sciences, Shanxi Normal University, Linfen, 041004, China.

School of Biological Sciences and Engineering, Fujian Key Laboratory of Marine Enzyme Engineering, Fuzhou University, Fuzhou, 350002, China.

Publication information

Plant Mol Biol. 2021 Apr;105(6):601-610. doi: 10.1007/s11103-020-01112-w. Epub 2021 Feb 1.

Abstract

We developed two CNNs for predicting ubiquitination sites in Arabidopsis thaliana, demonstrated their competitive performance, analyzed amino acid physicochemical properties and the CNN structures, and predicted ubiquitination sites in Arabidopsis. As an important post-translational protein modification, ubiquitination plays critical roles in plant physiology, including growth and development, responses to biotic and abiotic stress, and metabolism. Many ubiquitination site prediction models have been developed for human, mouse, and yeast. However, few models exist for predicting ubiquitination sites in the plant Arabidopsis thaliana. We therefore propose two convolutional neural network (CNN) based models for predicting ubiquitination sites in A. thaliana. The two models reach AUC (area under the ROC curve) values of 0.924 and 0.913, respectively, in five-fold cross-validation, and 0.921 and 0.914, respectively, in an independent test, outperforming other models. We analyze in depth the amino acid physicochemical properties of the sequence regions neighboring the ubiquitination sites and study the influence of the CNN structure on prediction performance. Potential ubiquitination sites across the whole Arabidopsis proteome are predicted using the two CNN models. The source code, training and test datasets, and predicted ubiquitination sites in the Arabidopsis proteome are available on GitHub ( http://github.com/nongdaxiaofeng/CNNAthUbi ) for interested users.
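The abstract does not describe how sequences are fed to the CNNs or how AUC is computed, so the following is only a minimal sketch of one common setup, not the authors' implementation (their actual code is in the GitHub repository above). It assumes that candidate sites are lysine (K) residues, that each site is represented by a fixed-length window centered on the lysine, and that windows are one-hot encoded. The flank length of 15, the padding symbol `X`, and all function names are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "X"  # hypothetical padding symbol for positions past the protein ends


def lysine_windows(seq: str, flank: int = 15) -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every lysine in seq.

    Each window has length 2 * flank + 1 with the lysine at its center.
    The default flank of 15 is an assumption, not a value from the abstract.
    """
    padded = PAD * flank + seq + PAD * flank
    windows = []
    for i, residue in enumerate(seq):
        if residue == "K":
            windows.append((i + 1, padded[i : i + 2 * flank + 1]))
    return windows


def one_hot(window: str) -> List[List[int]]:
    """Encode a window as a len(window) x 20 binary matrix.

    Padding and non-standard residues become all-zero rows.
    """
    return [[1 if aa == letter else 0 for letter in AMINO_ACIDS] for aa in window]


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and
    negatives (the Mann-Whitney formulation); ties count as 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In a setup like this, each encoded window would be the input to a 1D CNN, and `auc` would be evaluated on the held-out fold in each round of five-fold cross-validation and again on the independent test set.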
