
CapsNet-TIS: Predicting translation initiation site based on multi-feature fusion and improved capsule network.

Affiliations

College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China.


Publication information

Gene. 2024 Oct 5;924:148598. doi: 10.1016/j.gene.2024.148598. Epub 2024 May 22.

Abstract

Genes are the basic units of protein synthesis in organisms, and accurately identifying the translation initiation site (TIS) of a gene is crucial for understanding gene regulation, transcription, and translation. However, existing models neither extract the feature information in TIS sequences adequately nor capture the complex hierarchical relationships among features. This paper therefore proposes a novel predictor named CapsNet-TIS. CapsNet-TIS first extracts TIS sequence information using four encoding methods: one-hot encoding, physical structure property (PSP) encoding, nucleotide chemical property (NCP) encoding, and nucleotide density (ND) encoding. Next, multi-scale convolutional neural networks fuse the encoded features to make the feature representation more comprehensive. Finally, the fused features are classified using a capsule network as the main network of the classification model, which captures the complex hierarchical relationships among the features. The capsule network is further improved by introducing a residual block, channel attention, and BiLSTM to enhance the model's feature extraction and sequence modeling capabilities. The performance of CapsNet-TIS is evaluated on TIS datasets from four species (human, mouse, bovine, and fruit fly), and ablation experiments demonstrate the effectiveness of each component. Comparison with models proposed by other researchers demonstrates the superior performance of CapsNet-TIS.
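As a rough illustration of the encoding step, the sketch below implements three of the four encodings named in the abstract (one-hot, NCP, and ND) in plain Python. It is not the authors' code. The function names are hypothetical, the NCP triples follow a commonly used convention (ring structure, functional group, hydrogen bonding) that the paper may or may not share, and ND is assumed to be the running frequency of each nucleotide up to its position. PSP encoding is omitted because it depends on a property table that the abstract does not give.

```python
# Illustrative sketch only; the paper's exact encodings may differ.

# One-hot: one indicator per nucleotide (order A, C, G, T is an assumption).
ONE_HOT = {"A": (1, 0, 0, 0), "C": (0, 1, 0, 0),
           "G": (0, 0, 1, 0), "T": (0, 0, 0, 1)}

# NCP, assumed convention: (purine ring, amino group, weak hydrogen bond).
NCP = {"A": (1, 1, 1), "C": (0, 1, 0), "G": (1, 0, 0), "T": (0, 0, 1)}


def _normalize(seq):
    """Upper-case the sequence and treat RNA 'U' as 'T'."""
    return seq.upper().replace("U", "T")


def one_hot_encode(seq):
    """Return one 4-tuple per base; unknown bases map to all zeros."""
    return [ONE_HOT.get(b, (0, 0, 0, 0)) for b in _normalize(seq)]


def ncp_encode(seq):
    """Return one 3-tuple per base; unknown bases map to all zeros."""
    return [NCP.get(b, (0, 0, 0)) for b in _normalize(seq)]


def nd_encode(seq):
    """Return, for each position i (1-based), the fraction of the
    prefix seq[:i] occupied by the base found at position i."""
    counts = {}
    densities = []
    for i, base in enumerate(_normalize(seq), start=1):
        counts[base] = counts.get(base, 0) + 1
        densities.append(counts[base] / i)
    return densities


def encode_sequence(seq):
    """Concatenate the three encodings per position: 4 + 3 + 1 = 8 values."""
    rows = zip(one_hot_encode(seq), ncp_encode(seq), nd_encode(seq))
    return [list(o) + list(n) + [d] for o, n, d in rows]


if __name__ == "__main__":
    for row in encode_sequence("ATGA"):
        print(row)
```

Each position becomes an 8-value row here. In a pipeline like the one the abstract describes, such per-position matrices would be the input to the multi-scale convolutional layers that perform feature fusion.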

