CRYSTALP2: sequence-based protein crystallization propensity prediction.

Authors

Kurgan Lukasz, Razib Ali A, Aghakhani Sara, Dick Scott, Mizianty Marcin, Jahandideh Samad

Affiliation

Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada.

Publication

BMC Struct Biol. 2009 Jul 31;9:50. doi: 10.1186/1472-6807-9-50.

DOI: 10.1186/1472-6807-9-50
PMID: 19646256
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2731098/
Abstract

BACKGROUND

Current protocols yield crystals for <30% of known proteins, indicating that automatically identifying crystallizable proteins may improve high-throughput structural genomics efforts. We introduce CRYSTALP2, a kernel-based method that predicts the propensity of a given protein sequence to produce diffraction-quality crystals. This method utilizes the composition and collocation of amino acids, isoelectric point, and hydrophobicity, as estimated from the primary sequence, to generate predictions. CRYSTALP2 extends its predecessor, CRYSTALP, by enabling predictions for sequences of unrestricted size and provides improved prediction quality.
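The sequence-derived inputs named above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "collocation" means the frequency of residue pairs separated by a fixed number of intervening residues, and it uses the Kyte-Doolittle scale as a stand-in hydrophobicity index, which the abstract does not specify. The function names are hypothetical, and the isoelectric point is omitted because it depends on a choice of pKa values.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values (assumed scale; the abstract names none).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq):
    """Fraction of each of the 20 standard amino acids in the sequence."""
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in AMINO_ACIDS}


def collocation(seq, gap):
    """Frequency of residue pairs separated by `gap` intervening residues.

    gap=0 gives adjacent dipeptides; gap=1 gives pairs of the form a-x-b.
    """
    step = gap + 1
    total = len(seq) - step
    if total <= 0:
        return {}
    counts = Counter(seq[i] + seq[i + step] for i in range(total))
    return {pair: n / total for pair, n in counts.items()}


def mean_hydropathy(seq):
    """Average hydropathy over the chain (assumes standard residues only)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)
```

Because every feature is a frequency or an average, the resulting values do not depend on chain length, which is one plausible way to handle sequences of unrestricted size.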

RESULTS

A significant majority of the collocations used by CRYSTALP2 include residues with high conformational entropy, or low entropy and high potential to mediate crystal contacts; notably, such residues are utilized by surface entropy reduction methods. We show that the collocations provide complementary information to the hydrophobicity and isoelectric point. Tests on four datasets show that CRYSTALP2 outperforms several existing sequence-based predictors (CRYSTALP, OB-score, and SECRET). CRYSTALP2's accuracy, MCC, and AROC range between 69.3 and 77.5%, 0.39 and 0.55, and 0.72 and 0.79, respectively. Our predictions are similar in quality and are complementary to the predictions of the most recent ParCrys and XtalPred methods. Our results also suggest that, as work in protein crystallization continues (thereby enlarging the population of proteins with known crystallization propensities), the prediction quality of the CRYSTALP2 method should increase. The prediction model and the datasets used in this contribution can be downloaded from http://biomine.ece.ualberta.ca/CRYSTALP2/CRYSTALP2.html.
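The three quality measures reported above (accuracy, MCC, and AROC) can be computed as in the following sketch. It is illustrative only and is not the authors' evaluation code; the function names are hypothetical, labels are assumed to be 1 for crystallizable and 0 for non-crystallizable, and the area under the ROC curve is computed by direct pairwise comparison of scores.

```python
import math


def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(y_true, y_pred):
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; 0.0 when it is undefined."""
    tp, tn, fp, fn = confusion(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as one half. Requires at least one example of each class.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

MCC ranges from -1 to 1 and AROC from 0 to 1, with 0 and 0.5 respectively corresponding to chance-level prediction, so the reported ranges of 0.39 to 0.55 and 0.72 to 0.79 both sit clearly above chance.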

CONCLUSION

CRYSTALP2 provides relatively accurate crystallization propensity predictions for a given protein chain that either outperform or complement the existing approaches. The proposed method can be used to support current efforts towards improving the success rate in obtaining diffraction-quality crystals.
