Classification of nuclear receptors based on amino acid composition and dipeptide composition.

Author information

Bhasin Manoj, Raghava Gajendra P S

Affiliation

Institute of Microbial Technology, Chandigarh 160036, India.

Publication information

J Biol Chem. 2004 May 28;279(22):23262-6. doi: 10.1074/jbc.M401932200. Epub 2004 Mar 23.

DOI: 10.1074/jbc.M401932200

PMID: 15039428

Abstract

Nuclear receptors are key transcription factors that regulate crucial gene networks responsible for cell growth, differentiation, and homeostasis. They form a superfamily of phylogenetically related proteins and control functions associated with major diseases (e.g. diabetes, osteoporosis, and cancer). In this study, a novel method was developed for classifying the subfamilies of nuclear receptors. Classification was based on the amino acid composition and dipeptide composition of receptor sequences, using support vector machines. Training and testing were performed on a non-redundant data set of 282 proteins obtained from the NucleaRDB database (1). The performance of all classifiers was evaluated using 5-fold cross-validation: the data set was randomly partitioned into five equal sets, and each set in turn was used for testing while the remaining four were used for training. Different subfamilies of nuclear receptors were found to be quite closely correlated in terms of amino acid composition as well as dipeptide composition. The overall accuracies of the amino acid composition-based and dipeptide composition-based classifiers were 82.6% and 97.5%, respectively. These results therefore show that different subfamilies of nuclear receptors are predictable with considerable accuracy using amino acid or dipeptide composition. Furthermore, based on the above approach, an online web service, NRpred, was developed, which is available at www.imtech.res.in/raghava/nrpred.
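The two feature encodings and the 5-fold partition described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the use of percentages rather than fractions, the handling of non-standard residue symbols, and the fixed random seed are all assumptions made for illustration. Amino acid composition gives a 20-value vector per sequence, and dipeptide composition gives a 400-value vector (one value per ordered residue pair).

```python
import random
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# All 400 ordered residue pairs, in a fixed order (AA, AC, ..., YY).
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Percentage of each standard residue in seq (20 values).

    Assumption: non-standard symbols (e.g. X) are ignored.
    """
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    total = len(residues)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [100.0 * residues.count(a) / total for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Percentage of each ordered pair of adjacent residues (400 values).

    Assumption: pairs containing a non-standard symbol are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [100.0 * counts[d] / total for d in DIPEPTIDES]


def five_fold_indices(n, seed=0):
    """Yield (train, test) index lists for a random 5-fold partition of n items.

    Each fold serves as the test set exactly once. When n is not divisible
    by 5 (e.g. n = 282), fold sizes differ by at most one.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::5] for k in range(5)]
    for k in range(5):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test
```

In a full pipeline, the vectors for the training indices of each fold would be passed to a support vector machine (for example scikit-learn's `SVC`, which is an assumption here; the abstract does not name an implementation or kernel), and accuracy would be computed on the held-out fold.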


Similar articles

Classification of nuclear receptors based on amino acid composition and dipeptide composition.

J Biol Chem. 2004 May 28;279(22):23262-6. doi: 10.1074/jbc.M401932200. Epub 2004 Mar 23.

Improving the classification of nuclear receptors with feature selection.

Protein Pept Lett. 2009;16(7):823-9. doi: 10.2174/092986609788681733.

GPCRsclass: a web tool for the classification of amine type of G-protein-coupled receptors.

Nucleic Acids Res. 2005 Jul 1;33(Web Server issue):W143-7. doi: 10.1093/nar/gki351.

Prediction of nuclear receptors with optimal pseudo amino acid composition.

Anal Biochem. 2009 Apr 1;387(1):54-9. doi: 10.1016/j.ab.2009.01.018. Epub 2009 Jan 19.

NR-2L: a two-level predictor for identifying nuclear receptor subfamilies based on sequence-derived features.

PLoS One. 2011;6(8):e23505. doi: 10.1371/journal.pone.0023505. Epub 2011 Aug 15.

ESLpred: SVM-based method for subcellular localization of eukaryotic proteins using dipeptide composition and PSI-BLAST.

Nucleic Acids Res. 2004 Jul 1;32(Web Server issue):W414-9. doi: 10.1093/nar/gkh350.

Support vector machine (SVM) based multiclass prediction with basic statistical analysis of plasminogen activators.

BMC Res Notes. 2014 Jan 27;7:63. doi: 10.1186/1756-0500-7-63.

Accurate prediction of nuclear receptors with conjoint triad feature.

BMC Bioinformatics. 2015 Dec 3;16:402. doi: 10.1186/s12859-015-0828-1.

Classification of G-protein coupled receptors at four levels.

Protein Eng Des Sel. 2006 Nov;19(11):511-6. doi: 10.1093/protein/gzl038. Epub 2006 Oct 10.

Oxypred: prediction and classification of oxygen-binding proteins.

Genomics Proteomics Bioinformatics. 2007 Dec;5(3-4):250-2. doi: 10.1016/S1672-0229(08)60012-1.

Cited by

An artificial intelligence-based approach for identifying the proteins regulating liquid-liquid phase separation.

Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf313.

A genetic algorithm-based ensemble model for efficiently identifying interleukin 6 inducing peptides.

Sci Rep. 2025 Jul 1;15(1):21213. doi: 10.1038/s41598-025-05491-2.

Enhancing the Feature Representation of Protein Sequence Descriptors in Protein-Protein Interaction Prediction.

Interdiscip Sci. 2025 Jun 2. doi: 10.1007/s12539-025-00723-5.

TFProtBert: Detection of Transcription Factors Binding to Methylated DNA Using ProtBert Latent Space Representation.

Int J Mol Sci. 2025 Apr 29;26(9):4234. doi: 10.3390/ijms26094234.

iNClassSec-ESM: Discovering potential non-classical secreted proteins through a novel protein language model.

Comput Struct Biotechnol J. 2025 Mar 28;27:1350-1358. doi: 10.1016/j.csbj.2025.03.043. eCollection 2025.

Predicting amyloid proteins using attention-based long short-term memory.

PeerJ Comput Sci. 2025 Feb 7;11:e2660. doi: 10.7717/peerj-cs.2660. eCollection 2025.

iAMP-CRA: Identifying Antimicrobial Peptides Using Convolutional Recurrent Neural Network with Self-Attention.

Health Inf Sci Syst. 2025 Mar 5;13(1):25. doi: 10.1007/s13755-025-00342-w. eCollection 2025 Dec.

DLBWE-Cys: a deep-learning-based tool for identifying cysteine S-carboxyethylation sites using binary-weight encoding.

Front Genet. 2025 Jan 8;15:1464976. doi: 10.3389/fgene.2024.1464976. eCollection 2024.

SProtFP: a machine learning-based method for functional classification of small ORFs in prokaryotes.

NAR Genom Bioinform. 2025 Jan 7;7(1):lqae186. doi: 10.1093/nargab/lqae186. eCollection 2025 Mar.

HPClas: A data-driven approach for identifying halophilic proteins based on catBoost.

mLife. 2024 Jul 20;3(4):515-526. doi: 10.1002/mlf2.12125. eCollection 2024 Dec.
