
Sequence-based prediction of protein crystallization, purification and production propensity.

Affiliation

Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada.

Publication information

Bioinformatics. 2011 Jul 1;27(13):i24-33. doi: 10.1093/bioinformatics/btr229.

DOI:10.1093/bioinformatics/btr229
PMID:21685077
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3117383/
Abstract

MOTIVATION

X-ray crystallography-based protein structure determination, which accounts for the majority of solved structures, is characterized by relatively low success rates. One solution is to build tools that support the selection of targets that are more likely to crystallize. Several in silico methods that predict the propensity for diffraction-quality crystallization from protein chains have been developed. We show that the quality of their predictions drops when they are applied to more recent crystallization trials, which calls for new solutions. We propose a novel approach that alleviates the drawbacks of the existing methods in three ways: by using a recent dataset and an improved protocol to annotate progress along the crystallization process; by predicting the success of the entire process and the steps that result in failed attempts; and by utilizing a compact and comprehensive set of sequence-derived inputs to generate accurate predictions.

RESULTS

The proposed PPCpred (predictor of protein Production, Purification and Crystallization) predicts the propensity for production of diffraction-quality crystals, production of crystals, purification and production of the protein material. PPCpred utilizes a comprehensive set of inputs based on energy and hydrophobicity indices, composition of certain amino acid types, predicted disorder, secondary structure and solvent accessibility, and content of certain buried and exposed residues. Our method significantly outperforms alignment-based predictions and several modern crystallization propensity predictors. Receiver operating characteristic (ROC) curves show that PPCpred is particularly useful for users who desire high true positive (TP) rates, i.e. a low rate of mispredictions for solvable chains. Our model reveals several intuitive factors that influence the success of individual steps and the entire crystallization process, including the content of Cys, buried His and Ser, hydrophobic/hydrophilic segments and the number of predicted disordered segments.
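To make the idea of sequence-derived inputs concrete, the following is a minimal sketch of how a few such descriptors could be computed from a protein chain. It is not PPCpred's feature set: the abstract does not list the exact indices or parameters, so the Kyte-Doolittle hydropathy scale, the window length, the threshold, and all function names here are assumptions chosen for illustration. The sketch also covers only overall residue content; features such as buried His and Ser would additionally require a solvent accessibility predictor, which is out of scope here.

```python
from collections import Counter

# Kyte-Doolittle hydropathy scale. Assumption: used here only as a
# familiar example of a hydrophobicity index; the abstract does not
# say which indices PPCpred relies on.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq, residues):
    """Return the fraction of each requested residue type in the chain."""
    counts = Counter(seq)
    return {r: counts[r] / len(seq) for r in residues}


def mean_hydropathy(seq):
    """Average hydropathy over all residues of the given (sub)sequence."""
    return sum(KYTE_DOOLITTLE[a] for a in seq) / len(seq)


def count_hydrophobic_segments(seq, window=5, threshold=1.6):
    """Count maximal runs of consecutive sliding windows whose mean
    hydropathy exceeds the threshold (window and threshold are
    hypothetical defaults, not values from the paper)."""
    segments = 0
    in_segment = False
    for i in range(len(seq) - window + 1):
        if mean_hydropathy(seq[i:i + window]) > threshold:
            if not in_segment:
                segments += 1
            in_segment = True
        else:
            in_segment = False
    return segments


def sequence_features(seq):
    """Build a small feature dictionary from a one-letter amino acid string."""
    seq = seq.upper()
    if not seq or set(seq) - set(KYTE_DOOLITTLE):
        raise ValueError("sequence must be non-empty and use the 20 standard residues")
    features = {f"frac_{r}": v for r, v in composition(seq, "CHS").items()}
    features["mean_hydropathy"] = mean_hydropathy(seq)
    features["hydrophobic_segments"] = count_hydrophobic_segments(seq)
    return features


if __name__ == "__main__":
    # Toy chain: two hydrophobic stretches separated by a lysine run.
    print(sequence_features("IIIIIKKKKKLLLLL"))
```

A feature dictionary like this would typically be fed to a trained classifier to produce a propensity score; the classifier itself is not sketched because the abstract does not describe it.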

AVAILABILITY

http://biomine.ece.ualberta.ca/PPCpred/.

CONTACT

lkurgan@ece.ualberta.ca.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/428e/3117383/88fb3bcd98d3/btr229f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/428e/3117383/02d32f718a7f/btr229f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/428e/3117383/d3631154ab10/btr229f2.jpg
