Essential genes identification model based on sequence feature map and graph convolutional neural network.

Affiliation

College of Physics and Electronic Information, Gannan Normal University, Ganzhou, Jiangxi, 341000, China.

Publication information

BMC Genomics. 2024 Jan 10;25(1):47. doi: 10.1186/s12864-024-09958-w.

Abstract

BACKGROUND

Essential genes encode functions that play a vital role in the life activities of organisms, encompassing growth, development, immune system functioning, and cell structure maintenance. Conventional experimental techniques for identifying essential genes are resource-intensive and time-consuming, and the accuracy of current machine learning models needs further enhancement. Therefore, it is crucial to develop a robust computational model to accurately predict essential genes.

RESULTS

In this study, we introduce GCNN-SFM, a computational model for identifying essential genes in organisms, based on a graph convolutional neural network (GCNN). GCNN-SFM integrates graph convolutional, convolutional, and fully connected layers to model and extract features from gene sequences. First, the gene sequence is transformed into a feature map using coding techniques. A multi-layer GCNN then performs graph convolution operations on this feature map, capturing both local and global features of the gene sequence. The resulting features are passed through the convolutional and fully connected layers to generate the essential-gene prediction. Gradient descent is used to iteratively update the model parameters by minimizing the cross-entropy loss, thereby improving prediction accuracy. In addition, the model's hyperparameters are tuned to find the combination that yields the best prediction performance during training.
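
The pipeline described above (sequence to feature map, graph convolution, fully connected output, cross-entropy minimized by gradient descent) can be illustrated with a minimal NumPy sketch. This is not the GCNN-SFM implementation: the abstract does not specify the encoding or the architecture details, so the k-mer graph construction, the layer sizes, the mean pooling, and all function names below are assumptions chosen for illustration. The intermediate convolutional layer is omitted for brevity, and only the output layer is updated in the gradient step.

```python
import numpy as np

BASES = "ACGT"

def kmer_index(kmer):
    """Map a k-mer such as 'AT' to an integer in base 4."""
    idx = 0
    for base in kmer:
        idx = idx * 4 + BASES.index(base)
    return idx

def kmer_graph(seq, k=2):
    """Hypothetical encoding: one node per possible k-mer, an undirected
    edge between k-mers that occur consecutively in the sequence, and the
    k-mer frequency as the single node feature."""
    n = len(BASES) ** k
    adj = np.zeros((n, n))
    counts = np.zeros(n)
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    for km in kmers:
        counts[kmer_index(km)] += 1
    for a, b in zip(kmers, kmers[1:]):
        i, j = kmer_index(a), kmer_index(b)
        adj[i, j] = adj[j, i] = 1.0
    feats = (counts / max(len(kmers), 1)).reshape(-1, 1)
    return adj, feats

def normalize_adjacency(adj):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    a_hat = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_layer(a_norm, h, w):
    """One graph convolution followed by ReLU."""
    return np.maximum(a_norm @ h @ w, 0.0)

def forward(adj, feats, params):
    """Two graph convolutions, mean pooling over nodes, softmax output."""
    a_norm = normalize_adjacency(adj)
    h = gcn_layer(a_norm, feats, params["w1"])
    h = gcn_layer(a_norm, h, params["w2"])
    pooled = h.mean(axis=0)
    logits = pooled @ params["w_out"] + params["b_out"]
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    return pooled, exp / exp.sum()

def cross_entropy(probs, label):
    return -np.log(probs[label] + 1e-12)

def sgd_step_output_layer(params, pooled, probs, label, lr=0.1):
    """One gradient-descent step on the output layer only (for brevity)."""
    grad_logits = probs.copy()
    grad_logits[label] -= 1.0  # d(loss)/d(logits) for softmax + cross-entropy
    params["w_out"] -= lr * np.outer(pooled, grad_logits)
    params["b_out"] -= lr * grad_logits

# Toy usage: a made-up sequence labelled as essential (label 1).
rng = np.random.default_rng(0)
params = {
    "w1": rng.normal(scale=0.5, size=(1, 8)),
    "w2": rng.normal(scale=0.5, size=(8, 8)),
    "w_out": rng.normal(scale=0.5, size=(8, 2)),
    "b_out": np.zeros(2),
}
adj, feats = kmer_graph("ATGCGTACGTTAGC")
pooled, probs = forward(adj, feats, params)
loss_before = cross_entropy(probs, label=1)
sgd_step_output_layer(params, pooled, probs, label=1)
_, probs_after = forward(adj, feats, params)
loss_after = cross_entropy(probs_after, label=1)
```

In a full model, the gradient would be propagated through every layer (typically with an automatic differentiation library) and the loop would run over many labelled sequences; the single step here only shows how the cross-entropy loss drives the parameter update.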

CONCLUSIONS

Experimental evaluation demonstrates that GCNN-SFM surpasses various advanced essential gene prediction models and achieves an average accuracy of 94.53%. This study presents a novel and effective approach for identifying essential genes, which has significant implications for biology and genomics research.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5af0/10777564/24dca90c6c8b/12864_2024_9958_Fig1_HTML.jpg
