Fertility-GRU: Identifying Fertility-Related Proteins by Incorporating Deep-Gated Recurrent Units and Original Position-Specific Scoring Matrix Profiles.

Affiliation

Medical Humanities Research Cluster, School of Humanities, Nanyang Technological University, 48 Nanyang Ave, Singapore 639798.

Publication information

J Proteome Res. 2019 Sep 6;18(9):3503-3511. doi: 10.1021/acs.jproteome.9b00411. Epub 2019 Aug 7.

Abstract

Protein function prediction is a well-known problem in proteome research and has attracted the attention of numerous researchers. However, implementing deep neural networks, which can improve protein function prediction, remains a major challenge. This study proposes a deep learning approach, Fertility-GRU, that combines gated recurrent units with position-specific scoring matrix profiles to predict whether a protein is fertility-related, a highly important biological function. Fertility-related proteins have also been shown to be important in many biological entities (e.g., bone marrow and peripheral blood, postnatal mammalian ovary) and parameters (e.g., daily sperm production). Our model achieves a cross-validation accuracy of 85.8% and an independent test accuracy of 91.1%. We also address overfitting by adding dropout layers to the deep learning model. On the independent test set, sensitivity, specificity, and Matthews correlation coefficient (MCC) were 90.5%, 91.7%, and 0.82, respectively. Fertility-GRU outperforms the state-of-the-art predictor on the same data set. This study provides a method that enables more proteins to be discovered, especially proteins associated with fertility. Moreover, our results could promote the use of recurrent networks and gated recurrent units in proteome research. The source code and data set are freely accessible via https://github.com/khanhlee/fertility-gru.
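
The metrics named in the abstract (accuracy, sensitivity, specificity, MCC) all follow from a binary confusion matrix. The sketch below uses their standard definitions. The function name `binary_metrics` and the example counts are hypothetical: the counts were chosen only because they round to the reported independent-test figures, and they are not taken from the paper or its repository.

```python
import math


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: fertility-related proteins predicted correctly / missed.
    tn/fp: non-fertility proteins predicted correctly / wrongly flagged.
    """
    total = tp + fn + tn + fp
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    accuracy = (tp + tn) / total if total else 0.0

    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts (an assumption, not the paper's data), picked so
    # that the rounded metrics agree with the values quoted in the abstract.
    metrics = binary_metrics(tp=19, fn=2, tn=22, fp=2)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

MCC is worth reporting alongside accuracy because it draws on all four cells of the confusion matrix, so it stays informative when the positive and negative classes differ in size.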
