Identification of clathrin proteins by incorporating hyperparameter optimization in deep learning and PSSM profiles.

Author information

Medical Humanities Research Cluster, School of Humanities, Nanyang Technological University, 48 Nanyang Ave, 639798 Singapore.

Department of Electrical Electronic and Mechanical Engineering, Lac Hong University, No. 10 Huynh Van Nghe Road, Bien Hoa, Dong Nai, Vietnam.

Publication information

Comput Methods Programs Biomed. 2019 Aug;177:81-88. doi: 10.1016/j.cmpb.2019.05.016. Epub 2019 May 17.

Abstract

BACKGROUND AND OBJECTIVES

Clathrin is an adaptor protein that serves as the principal element of the vesicle-coating complex and is important for the membrane cleavage that releases the invaginated vesicle from the plasma membrane. The functional loss of clathrin has been linked to many human diseases, e.g., neurodegenerative disorders, cancer, and Alzheimer's disease. Therefore, creating a precise model to identify its functions is a crucial step towards understanding human diseases and designing drug targets.

METHODS

We present a deep learning model that uses a two-dimensional convolutional neural network (2D CNN) and position-specific scoring matrix (PSSM) profiles to identify clathrin proteins from high-throughput sequences. Because 2D CNNs traditionally take images as input, we treated each PSSM profile, a 20 × 20 matrix, as an image of 20 × 20 pixels. The PSSM profile was then fed into our 2D CNN, for which we set a variety of parameters to improve model performance. Based on the 10-fold cross-validation results, a hyperparameter optimization process was employed to find the best model for our dataset. Finally, an independent dataset was used to assess the predictive ability of the model.
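The abstract does not state how a variable-length PSSM (one row of 20 scores per residue) becomes a fixed 20 × 20 input. The following is a minimal sketch of one common convention, assumed here rather than taken from the paper: sum the PSSM rows that belong to the same amino-acid type and divide by the sequence length. The function name `pssm_to_20x20`, the residue ordering in `AMINO_ACIDS`, and the length scaling are all illustrative assumptions.

```python
# Assumed row/column ordering of the 20 standard amino acids (illustrative).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pssm_to_20x20(sequence, pssm):
    """Collapse an L x 20 PSSM into a fixed 20 x 20 matrix.

    Output row i is the sum of all PSSM rows whose residue is
    AMINO_ACIDS[i], divided by the sequence length (an assumed scaling).
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    if len(sequence) != len(pssm):
        raise ValueError("sequence and PSSM must have the same length")

    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    out = [[0.0] * 20 for _ in range(20)]

    for residue, row in zip(sequence, pssm):
        if len(row) != 20:
            raise ValueError("each PSSM row must hold 20 scores")
        i = index.get(residue)
        if i is None:
            continue  # skip non-standard residues such as 'X'
        for j, score in enumerate(row):
            out[i][j] += score

    length = len(sequence)
    return [[value / length for value in row] for row in out]
```

The resulting 20 × 20 matrix can then be passed to a 2D CNN as a single-channel "image", which is the framing the abstract describes.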

RESULTS

Our model identified clathrin proteins with a sensitivity of 92.2%, a specificity of 91.2%, an accuracy of 91.8%, and a Matthews correlation coefficient (MCC) of 0.83 on the independent dataset. Compared with state-of-the-art traditional neural networks, our method achieved a significant improvement in all typical measurement metrics.

CONCLUSIONS

This study provides an effective tool for investigating clathrin proteins, and our results could promote the use of deep learning in biomedical research. The source code and dataset are freely available at https://www.github.com/khanhlee/deep-clathrin/.
