Identify RNA-associated subcellular localizations based on multi-label learning using Chou's 5-steps rule.

Affiliations

School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China.

School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China.

Publication information

BMC Genomics. 2021 Jan 15;22(1):56. doi: 10.1186/s12864-020-07347-7.

DOI:10.1186/s12864-020-07347-7
PMID:33451286
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7811227/
Abstract

BACKGROUND

The biological functions of biomolecules depend on the cellular compartments in which they reside. In particular, RNAs are localized to specific regions of a cell, which allows the cell to carry out diverse biochemical processes concurrently. However, many existing RNA subcellular localization classifiers address only single-label classification. Extending RNA subcellular localization to a multi-label classification problem is therefore of great practical significance.

RESULTS

In this study, we extract multi-label classification datasets of RNA-associated subcellular localizations for various types of RNA, and then construct subcellular localization datasets for four RNA categories. To study Homo sapiens, we further establish human RNA subcellular localization datasets. We then use different nucleotide property composition models to extract features that adequately represent the important information in nucleotide sequences. In the most critical step, we address a major challenge: fusing this multivariate information through multiple kernel learning based on the Hilbert-Schmidt independence criterion. The optimal combined kernel is fed into an integrated support vector machine model to identify multi-label RNA subcellular localizations. Our method achieves average precision of 0.703, 0.757, 0.787, and 0.800 on the four RNA datasets, respectively.
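
The kernel-fusion idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain k-mer composition as a stand-in for the nucleotide property composition models, Gaussian (RBF) base kernels, and a simple heuristic that weights each base kernel by its empirical Hilbert-Schmidt independence criterion (HSIC) score against a linear label kernel, rather than the optimization used in the paper. All function names and parameters here are hypothetical.

```python
import numpy as np
from itertools import product


def kmer_composition(seq, k):
    """Normalized k-mer frequency vector for an RNA sequence (alphabet ACGU)."""
    kmers = ["".join(p) for p in product("ACGU", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    for i in range(len(seq) - k + 1):
        j = index.get(seq[i:i + k])
        if j is not None:  # skip windows containing unexpected characters
            counts[j] += 1
    total = counts.sum()
    return counts / total if total > 0 else counts


def rbf_kernel(X, gamma=1.0):
    """Gaussian kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def hsic(K, L):
    """Biased empirical HSIC estimate: tr(K H L H) / (n - 1)^2."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2


def hsic_kernel_weights(kernels, Y):
    """Assumed heuristic: weight each kernel by its HSIC with the label kernel."""
    Y = np.asarray(Y, dtype=float)
    L = Y @ Y.T  # linear kernel on the binary multi-label matrix
    scores = np.array([max(hsic(K, L), 0.0) for K in kernels])
    if scores.sum() == 0.0:
        return np.full(len(kernels), 1.0 / len(kernels))
    return scores / scores.sum()


if __name__ == "__main__":
    # Toy sequences and labels, invented purely for illustration.
    seqs = ["ACGUACGUAA", "GGGCCCGGGC", "AAAAUUUUAA", "CGCGCGCGGG"]
    Y = [[1, 0], [0, 1], [1, 1], [0, 1]]  # rows: RNAs, columns: compartments
    views = [np.array([kmer_composition(s, k) for s in seqs]) for k in (1, 2)]
    kernels = [rbf_kernel(X, gamma=5.0) for X in views]
    weights = hsic_kernel_weights(kernels, Y)
    combined = sum(w * K for w, K in zip(weights, kernels))
    print(weights, combined.shape)
```

A combined kernel of this kind could then be passed to a kernel classifier that accepts precomputed Gram matrices, for example one binary support vector machine per compartment; that step is omitted here to keep the sketch dependency-free.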

CONCLUSION

Specifically, our method outperforms other prediction tools on the new benchmark datasets. Moreover, we establish a user-friendly web server that implements our method.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8cf7/7811227/b9126a0119bf/12864_2020_7347_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8cf7/7811227/777c1d6b37db/12864_2020_7347_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8cf7/7811227/8eee7121681c/12864_2020_7347_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8cf7/7811227/53e665808fae/12864_2020_7347_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8cf7/7811227/2dec640dc2fb/12864_2020_7347_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8cf7/7811227/c958c602c41f/12864_2020_7347_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8cf7/7811227/78295b133d41/12864_2020_7347_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8cf7/7811227/6b587abe8b65/12864_2020_7347_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8cf7/7811227/5cc52b8119ee/12864_2020_7347_Fig9_HTML.jpg
