NR-2L: a two-level predictor for identifying nuclear receptor subfamilies based on sequence-derived features.

Affiliation

Computer Department, Jing-De-Zhen Ceramic Institute, Jing-De-Zhen, China.

Publication information

PLoS One. 2011;6(8):e23505. doi: 10.1371/journal.pone.0023505. Epub 2011 Aug 15.

Abstract

Nuclear receptors (NRs) are one of the most abundant classes of transcriptional regulators in animals. They regulate diverse functions, such as homeostasis, reproduction, development, and metabolism, which makes them important targets for drug development. NRs form a superfamily of phylogenetically related proteins and have been subdivided into subfamilies according to their domain diversity. In this study, a two-level predictor called NR-2L was developed. It first identifies, from sequence information alone, whether a query protein is a nuclear receptor; if it is, the prediction continues automatically to assign the protein to one of the following seven subfamilies: (1) thyroid hormone like (NR1), (2) HNF4-like (NR2), (3) estrogen like (NR3), (4) nerve growth factor IB-like (NR4), (5) fushi tarazu-F1 like (NR5), (6) germ cell nuclear factor like (NR6), and (7) knirps like (NR0). The identification is made by a fuzzy K-nearest neighbor (FK-NN) classifier operating on a pseudo amino acid composition that incorporates various physicochemical and statistical features derived from the protein sequence, such as amino acid composition, dipeptide composition, complexity factor, and low-frequency Fourier spectrum components. On low-redundancy benchmark datasets derived from NucleaRDB and UniProt, the overall success rates achieved in the jackknife test were about 93% at the first level and 89% at the second level. These high success rates indicate that the two-level predictor can be a useful tool for identifying NRs and their subfamilies. NR-2L is freely accessible as a user-friendly web server at either http://icpr.jci.edu.cn/bioinfo/NR2L or http://www.jci-bioinfo.cn/NR2L. Each job submitted to NR-2L can contain up to 500 query protein sequences and finishes in less than 2 minutes; the fewer the query proteins, the shorter the run time usually is. All program code for NR-2L is available for non-commercial purposes upon request.
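
The pipeline described in the abstract (sequence-derived features, a fuzzy K-nearest neighbor classifier, a two-level decision, and a jackknife test) can be illustrated with a minimal Python sketch. This is not the NR-2L code: it uses only amino acid and dipeptide composition and omits the complexity factor and Fourier spectrum components. The fuzzy membership weighting follows the commonly used inverse-distance form with a fuzzifier `m`, which is an assumption about the classifier's details. All function names and the labels "NR" and "non-NR" are hypothetical and chosen for illustration.

```python
from itertools import product
from math import dist

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def sequence_features(seq):
    """Return 20 amino acid frequencies followed by 400 dipeptide frequencies."""
    # Keep only the 20 standard residues (assumption: others are ignored).
    seq = "".join(c for c in seq.upper() if c in AMINO_ACIDS)
    n = len(seq)
    if n < 2:
        raise ValueError("need at least two standard residues")
    aac = [seq.count(a) / n for a in AMINO_ACIDS]
    pair_counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(n - 1):
        pair_counts[seq[i:i + 2]] += 1
    dpc = [pair_counts[d] / (n - 1) for d in DIPEPTIDES]
    return aac + dpc


def fuzzy_knn(query, train_x, train_y, k=3, m=2.0):
    """Fuzzy K-NN with crisp training labels.

    Returns {label: membership}; the memberships sum to 1.
    Each of the k nearest neighbours votes with weight 1 / d^(2/(m-1)).
    """
    neighbours = sorted((dist(query, x), y) for x, y in zip(train_x, train_y))[:k]
    eps = 1e-12  # avoids division by zero when the query equals a training point
    weights = [(1.0 / (d ** (2.0 / (m - 1.0)) + eps), y) for d, y in neighbours]
    total = sum(w for w, _ in weights)
    membership = {}
    for w, y in weights:
        membership[y] = membership.get(y, 0.0) + w / total
    return membership


def predict_two_level(seq, level1, level2, k=3):
    """level1 and level2 are (train_x, train_y) pairs.

    Level 1 decides "NR" versus "non-NR"; level 2 runs only for predicted NRs
    and returns a subfamily label.
    """
    q = sequence_features(seq)
    m1 = fuzzy_knn(q, *level1, k=k)
    if max(m1, key=m1.get) != "NR":
        return "non-NR", None
    m2 = fuzzy_knn(q, *level2, k=k)
    return "NR", max(m2, key=m2.get)


def jackknife_accuracy(xs, ys, k=3):
    """Leave-one-out test: each sample is predicted from all the others."""
    hits = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        m = fuzzy_knn(xs[i], rest_x, rest_y, k=k)
        hits += max(m, key=m.get) == ys[i]
    return hits / len(xs)
```

In this sketch, `level1` would hold feature vectors labeled "NR" or "non-NR" and `level2` would hold NR feature vectors labeled by subfamily (for example "NR1" to "NR6" and "NR0"). Calling `jackknife_accuracy` on each training set gives a per-level success rate of the kind the abstract reports, although the numbers from such a reduced feature set should not be expected to match the published ones.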
