
GASIDN: identification of sub-Golgi proteins with multi-scale feature fusion.

Affiliations

School of Information Science and Engineering, University of Jinan, Jinan, China.

Laboratory of Zoology, Graduate School of Bioresource and Bioenvironmental Sciences, Kyushu University, Fukuoka-shi, Fukuoka, Japan.

Publication information

BMC Genomics. 2024 Oct 30;25(1):1019. doi: 10.1186/s12864-024-10954-3.

DOI: 10.1186/s12864-024-10954-3
PMID: 39478465
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11526662/
Abstract

The Golgi apparatus is a crucial component of the endomembrane system in eukaryotic cells and plays a central role in protein biosynthesis. Dysfunction of the Golgi apparatus has been linked to neurodegenerative diseases, so accurate identification of sub-Golgi protein types is essential for developing effective treatments for such diseases. Because experimental methods for identifying sub-Golgi protein types are expensive and time-consuming, various computational methods have been developed as identification tools. However, most of these methods rely solely on neighboring features in the protein sequence and neglect the spatial structure of the protein.

To identify sub-Golgi proteins more accurately, we developed a model called GASIDN. GASIDN extracts multi-dimensional features by applying a 1D convolution module to protein sequences and a graph learning module to contact maps constructed from AlphaFold2. The model uses the deep representation learning model SeqVec to initialize the protein sequence representations. GASIDN achieved accuracies of 98.4% and 96.4% in independent testing and ten-fold cross-validation, respectively, outperforming most previous predictors. To the best of our knowledge, this is the first method that uses multi-scale feature fusion to identify and locate sub-Golgi proteins.

To assess the generalizability and scalability of the model, we also applied it to the identification of proteins from other organelles, including plant vacuoles and peroxisomes. The results were promising and indicate that the model is effective and versatile. The source code and datasets can be accessed at https://github.com/SJNNNN/GASIDN.
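The fusion idea in the abstract can be illustrated with a small, dependency-free sketch. It is not the GASIDN implementation (see the linked repository for that): the function names, the 8 Å C-alpha contact threshold, the fixed smoothing kernel, and the mean-aggregation graph step are all assumptions chosen for illustration, and toy two-dimensional vectors stand in for SeqVec embeddings. The sketch builds a contact map from residue coordinates, runs one graph step over it, runs a 1D convolution along the sequence, and concatenates the two pooled views.

```python
import math

def contact_map(coords, threshold=8.0):
    """Binary contact map: residues i and j are in contact when their
    C-alpha atoms are within `threshold` angstroms (assumed cutoff).
    Self-contacts are included, so every residue has a neighbor."""
    n = len(coords)
    return [[1 if math.dist(coords[i], coords[j]) <= threshold else 0
             for j in range(n)] for i in range(n)]

def graph_mean_layer(adj, feats):
    """One message-passing step: each residue takes the mean of the
    feature vectors of all residues it is in contact with."""
    out = []
    for row in adj:
        nbrs = [feats[j] for j, a in enumerate(row) if a]
        out.append([sum(col) / len(nbrs) for col in zip(*nbrs)])
    return out

def conv1d(feats, kernel):
    """Valid 1D convolution along the sequence axis, applied to each
    feature channel independently with a fixed (not learned) kernel."""
    k = len(kernel)
    channels = len(feats[0])
    return [[sum(kernel[t] * feats[i + t][c] for t in range(k))
             for c in range(channels)]
            for i in range(len(feats) - k + 1)]

def mean_pool(feats):
    """Average per-residue vectors into one protein-level vector."""
    return [sum(col) / len(feats) for col in zip(*feats)]

def fuse(coords, embeddings, kernel=(0.25, 0.5, 0.25)):
    """Concatenate a sequence-local view (1D convolution) with a
    structure-aware view (graph step over the contact map)."""
    adj = contact_map(coords)
    structural = mean_pool(graph_mean_layer(adj, embeddings))
    sequential = mean_pool(conv1d(embeddings, kernel))
    return sequential + structural

# Toy protein: four residues spaced 3.8 angstroms apart on a line.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
fused = fuse(coords, embeddings)
print(len(fused))  # two sequence channels plus two structure channels
```

In a trained model the kernel and graph weights would be learned and the fused vector would feed a classifier; the sketch only shows how a sequence-local view and a contact-map view of the same residues can be combined into one representation.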


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d08a/11526662/aa8e36aa9b60/12864_2024_10954_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d08a/11526662/ec3177c9eadc/12864_2024_10954_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d08a/11526662/20576518df11/12864_2024_10954_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d08a/11526662/b18f3d6eb164/12864_2024_10954_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d08a/11526662/053746061026/12864_2024_10954_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d08a/11526662/ba47c57dea34/12864_2024_10954_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d08a/11526662/ff42854f899d/12864_2024_10954_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d08a/11526662/001fa65433aa/12864_2024_10954_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d08a/11526662/c3ba95607114/12864_2024_10954_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d08a/11526662/16afc9568bd3/12864_2024_10954_Fig10_HTML.jpg

Similar articles

1
GASIDN: identification of sub-Golgi proteins with multi-scale feature fusion.
BMC Genomics. 2024 Oct 30;25(1):1019. doi: 10.1186/s12864-024-10954-3.
2
Identification of plant vacuole proteins by using graph neural network and contact maps.
BMC Bioinformatics. 2023 Sep 22;24(1):357. doi: 10.1186/s12859-023-05475-x.
3
Identification of sub-Golgi protein localization by use of deep representation learning features.
Bioinformatics. 2021 Apr 5;36(24):5600-5609. doi: 10.1093/bioinformatics/btaa1074.
4
isGPT: An optimized model to identify sub-Golgi protein types using SVM and Random Forest based feature selection.
Artif Intell Med. 2018 Jan;84:90-100. doi: 10.1016/j.artmed.2017.11.003. Epub 2017 Nov 26.
5
A Novel Feature Extraction Method with Feature Selection to Identify Golgi-Resident Protein Types from Imbalanced Data.
Int J Mol Sci. 2016 Feb 6;17(2):218. doi: 10.3390/ijms17020218.
6
Protein Subcellular Localization Prediction Model Based on Graph Convolutional Network.
Interdiscip Sci. 2022 Dec;14(4):937-946. doi: 10.1007/s12539-022-00529-9. Epub 2022 Jun 17.
7
TAWFN: a deep learning framework for protein function prediction.
Bioinformatics. 2024 Oct 1;40(10). doi: 10.1093/bioinformatics/btae571.
8
Predicting miRNA-disease association via graph attention learning and multiplex adaptive modality fusion.
Comput Biol Med. 2024 Feb;169:107904. doi: 10.1016/j.compbiomed.2023.107904. Epub 2023 Dec 28.
9
MFSC: Multi-voting based feature selection for classification of Golgi proteins by adopting the general form of Chou's PseAAC components.
J Theor Biol. 2019 Feb 21;463:99-109. doi: 10.1016/j.jtbi.2018.12.017. Epub 2018 Dec 15.
10
Identification of plant vacuole proteins by exploiting deep representation learning features.
Comput Struct Biotechnol J. 2022 Jun 8;20:2921-2927. doi: 10.1016/j.csbj.2022.06.002. eCollection 2022.
