MSlocPRED: deep transfer learning-based identification of multi-label mRNA subcellular localization.

Affiliations

School of Artificial Intelligence and Computer Science, Jiangnan University, No. 1800 Lihu Avenue, Binhu District, Wuxi 214000, China.

School of Artificial Intelligence, Hebei University of Technology, 5340 Xiping Road, Beichen District, Tianjin 300130, China.

Publication information

Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae504.

DOI:10.1093/bib/bbae504
PMID:39401145
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11472759/
Abstract

Subcellular localization of messenger ribonucleic acid (mRNA) is a universal mechanism for precise and efficient control of the translation process. Although researchers have built many computational methods for predicting mRNA subcellular localization, very few are designed to handle mRNAs with multiple localization annotations, and their generalization performance leaves room for improvement. In this study, the prediction model MSlocPRED was constructed to identify multi-label mRNA subcellular localization. First, the preprocessed Dataset 1 and Dataset 2 were transformed into images. The proposed MDNDO-SMDU resampling technique was then used to balance the number of samples per category in the training dataset. Finally, deep transfer learning was used to construct MSlocPRED, which identifies subcellular localization across 16 classes (Dataset 1) and 18 classes (Dataset 2). Comparative tests of different resampling techniques show that the proposed technique is more effective as a preprocessing step for subcellular localization. The NC ends are the 5' and 3' untranslated regions that flank the protein-coding sequence and influence mRNA function without encoding proteins themselves. Results on datasets built by extracting NC ends of different lengths show that, for both Dataset 1 and Dataset 2, prediction performance is best when the NC-end length is 35 nucleotides. Independent testing and five-fold cross-validation against established prediction tools show that MSlocPRED is significantly better than those tools at identifying multi-label mRNA subcellular localization. Additionally, SHapley Additive exPlanations (SHAP) was used to explain how the MSlocPRED model works during prediction.
The predictive model and associated datasets are available on GitHub: https://github.com/ZBYnb1/MSlocPRED/tree/main.
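
The preprocessing idea in the abstract (fixed-length NC ends, an image-like input, and multi-label targets) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how sequences are turned into images or how NC ends are located, so everything here is an assumption made for illustration: the NC ends are approximated by the first and last k nucleotides of the transcript, the "image" is a plain 4-row one-hot grid, and the function names and compartment names are hypothetical.

```python
from typing import List, Sequence

NUCLEOTIDES = "ACGU"  # row order of the one-hot grid


def nc_ends(seq: str, k: int = 35) -> str:
    """Return the first k and last k nucleotides of seq, concatenated.

    Assumption: the NC ends are approximated by the two sequence ends.
    Sequences shorter than 2*k are right-padded with 'N' to length 2*k.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = seq.upper().replace("T", "U")  # accept DNA-style input
    if len(seq) >= 2 * k:
        return seq[:k] + seq[-k:]
    return seq.ljust(2 * k, "N")


def one_hot_grid(seq: str) -> List[List[int]]:
    """Encode seq as a 4 x len(seq) binary grid (rows follow NUCLEOTIDES).

    Any other symbol, such as the padding 'N', yields an all-zero column.
    """
    return [[1 if base == n else 0 for base in seq] for n in NUCLEOTIDES]


def multi_hot(labels: Sequence[str], classes: Sequence[str]) -> List[int]:
    """Encode a set of localization labels as a 0/1 vector over classes."""
    wanted = set(labels)
    unknown = wanted - set(classes)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if c in wanted else 0 for c in classes]


if __name__ == "__main__":
    # Hypothetical compartments; the paper uses 16 and 18 classes.
    classes = ["cytosol", "nucleus", "exosome"]
    transcript = "A" * 40 + "C" * 10 + "G" * 40
    ends = nc_ends(transcript, k=35)
    grid = one_hot_grid(ends)
    target = multi_hot(["nucleus", "cytosol"], classes)
    print(len(ends), len(grid), len(grid[0]), target)
```

A multi-hot target lets one mRNA belong to several compartments at once, which is what distinguishes the multi-label setting from ordinary single-label classification.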
