A novel method of predicting protein disordered regions based on sequence features.

Affiliation

Institute of Systems Biology, Shanghai University, Shanghai 200444, China.

Publication information

Biomed Res Int. 2013;2013:414327. doi: 10.1155/2013/414327. Epub 2013 Apr 22.

DOI: 10.1155/2013/414327
PMID: 23710446
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3654632/

Abstract

Many disordered proteins with important functions have been discovered, so effective computational methods for predicting protein disordered regions are highly desirable. In this study, we developed a new method for predicting disordered regions in proteins, based on Random Forest (RF), Maximum Relevancy Minimum Redundancy (mRMR), and Incremental Feature Selection (IFS). The mRMR criterion was used to rank the importance of all candidate features. The top 128 features in the ranked list were then selected to build the optimal model: 92 Position Specific Scoring Matrix (PSSM) conservation score features and 36 secondary structure features. This model achieved a Matthews correlation coefficient (MCC) of 0.3895 on the training set in 10-fold cross-validation. We then applied a scanning and modification strategy to the prediction results for each query sequence to improve performance. The accuracy (ACC) and MCC increased by 4% and almost 0.2%, respectively, compared with three other popular predictors: DISOPRED, DISOclust, and OnD-CRF. The selected features may shed some light on how disordered structures form and may provide guidelines for experimental validation.
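
The evaluation and selection steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The MCC and ACC formulas are the standard binary-classification definitions. The IFS loop is a generic version that scores nested prefixes of an already ranked feature list. The `evaluate` callback, the feature names, and the toy scores are hypothetical; in the study, the ranking would come from mRMR and the score from a Random Forest assessed by 10-fold cross-validation.

```python
import math
from typing import Callable, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Count TP, TN, FP, FN for binary labels (1 = disordered, 0 = ordered)."""
    tp = tn = fp = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 0:
            tn += 1
        elif t == 0 and p == 1:
            fp += 1
        else:  # t == 1 and p == 0
            fn += 1
    return tp, tn, fp, fn


def mcc(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of residues whose predicted label matches the true label."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def incremental_feature_selection(
    ranked_features: Sequence[str],
    evaluate: Callable[[Sequence[str]], float],
) -> Tuple[int, float]:
    """Score the top-1, top-2, ... prefixes of a ranked list and keep the best.

    `evaluate` is a hypothetical callback that returns a score (for example,
    a cross-validated MCC) for a model built on the given feature subset.
    """
    best_k, best_score = 0, float("-inf")
    for k in range(1, len(ranked_features) + 1):
        score = evaluate(ranked_features[:k])
        if score > best_score:
            best_k, best_score = k, score
    return best_k, best_score


if __name__ == "__main__":
    # Toy per-residue labels, invented for illustration only.
    true_labels = [1, 1, 0, 0]
    predicted = [1, 0, 0, 0]
    print("MCC:", mcc(true_labels, predicted))
    print("ACC:", accuracy(true_labels, predicted))

    # Hypothetical ranked features and made-up scores per subset size.
    ranked = ["pssm_a", "pssm_b", "ss_helix"]
    toy_scores = {1: 0.20, 2: 0.35, 3: 0.30}
    print(incremental_feature_selection(ranked, lambda subset: toy_scores[len(subset)]))
```

Passing the evaluator as a callback keeps the selection loop independent of the classifier, so the same loop could wrap any model and any cross-validation scheme.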

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a31d/3654632/97643caed9a7/BMRI2013-414327.004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a31d/3654632/f2b102b675eb/BMRI2013-414327.001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a31d/3654632/a76281443c60/BMRI2013-414327.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a31d/3654632/1e91d4d12292/BMRI2013-414327.003.jpg
