
Critical evaluation of bioinformatics tools for the prediction of protein crystallization propensity.

Affiliations

Department of Chemical Biology, College of Chemistry and Chemical Engineering, Xiamen University, China.

NMR Center, Xiamen University, China.

Publication information

Brief Bioinform. 2018 Sep 28;19(5):838-852. doi: 10.1093/bib/bbx018.

DOI:10.1093/bib/bbx018
PMID:28334201
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6171492/

Abstract

X-ray crystallography is the main tool for structural determination of proteins. Yet, the underlying crystallization process is costly, has a high attrition rate and involves a series of trial-and-error attempts to obtain diffraction-quality crystals. The Structural Genomics Consortium aims to systematically solve representative structures of major protein-fold classes using primarily high-throughput X-ray crystallography. The attrition rate of these efforts can be improved by selection of proteins that are potentially easier to be crystallized. In this context, bioinformatics approaches have been developed to predict crystallization propensities based on protein sequences. These approaches are used to facilitate prioritization of the most promising target proteins, search for alternative structural orthologues of the target proteins and suggest designs of constructs capable of potentially enhancing the likelihood of successful crystallization. We reviewed and compared nine predictors of protein crystallization propensity. Moreover, we demonstrated that integrating selected outputs from multiple predictors as candidate input features to build the predictive model results in a significantly higher predictive performance when compared to using these predictors individually. Furthermore, we also introduced a new and accurate predictor of protein crystallization propensity, Crysf, which uses functional features extracted from UniProt as inputs. This comprehensive review will assist structural biologists in selecting the most appropriate predictor, and is also beneficial for bioinformaticians to develop a new generation of predictive algorithms.
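
The integration idea mentioned in the abstract, feeding the outputs of several predictors into one model as input features, can be illustrated with a small stacking sketch. This is an illustration only and not the authors' model: the abstract does not say which learning algorithm or which predictor outputs were used, so the logistic-regression meta-model, the function names `train_meta_predictor` and `predict_propensity`, and the synthetic scores below are all assumptions.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_meta_predictor(score_rows, labels, lr=0.5, epochs=500):
    """Fit a logistic-regression meta-model by batch gradient descent.

    score_rows: one list per protein holding the scores of the base predictors.
    labels: 1 for crystallizable, 0 otherwise.
    """
    n = len(score_rows)
    n_features = len(score_rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(score_rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log-loss with respect to the logit
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict_propensity(weights, bias, scores):
    """Combine base-predictor scores into one propensity in [0, 1]."""
    return sigmoid(bias + sum(w * s for w, s in zip(weights, scores)))


# Synthetic stand-in data (an assumption): three hypothetical base predictors
# whose scores are noisy but tend to be higher for crystallizable proteins.
rng = random.Random(0)
score_rows, labels = [], []
for _ in range(200):
    y = rng.randint(0, 1)
    center = 0.7 if y else 0.3
    row = [min(1.0, max(0.0, rng.gauss(center, 0.2))) for _ in range(3)]
    score_rows.append(row)
    labels.append(y)

weights, bias = train_meta_predictor(score_rows, labels)
high = predict_propensity(weights, bias, [0.9, 0.9, 0.9])
low = predict_propensity(weights, bias, [0.1, 0.1, 0.1])
print(f"all base scores high: {high:.3f}, all base scores low: {low:.3f}")
```

With real data, each row would hold the outputs of actual predictors for one protein, and performance would need to be assessed on proteins held out from training.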

Similar articles

1. Critical evaluation of bioinformatics tools for the prediction of protein crystallization propensity.
Brief Bioinform. 2018 Sep 28;19(5):838-852. doi: 10.1093/bib/bbx018.
2. PredPPCrys: accurate prediction of sequence cloning, protein production, purification and crystallization propensity from protein sequences using multi-step heterogeneous feature fusion and selection.
PLoS One. 2014 Aug 22;9(8):e105902. doi: 10.1371/journal.pone.0105902. eCollection 2014.
3. Accurate multistage prediction of protein crystallization propensity using deep-cascade forest with sequence-based features.
Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa076.
4. Survey of Predictors of Propensity for Protein Production and Crystallization with Application to Predict Resolution of Crystal Structures.
Curr Protein Pept Sci. 2018;19(2):200-210. doi: 10.2174/1389203718666170921114437.
5. fDETECT webserver: fast predictor of propensity for protein production, purification, and crystallization.
BMC Bioinformatics. 2018 Jan 3;18(1):580. doi: 10.1186/s12859-017-1995-z.
6. CLPred: a sequence-based protein crystallization predictor using BLSTM neural network.
Bioinformatics. 2020 Dec 30;36(Suppl_2):i709-i717. doi: 10.1093/bioinformatics/btaa791.
7. Computational approaches to selecting and optimising targets for structural biology.
Methods. 2011 Sep;55(1):3-11. doi: 10.1016/j.ymeth.2011.08.014. Epub 2011 Aug 27.
8. DeepCrystal: a deep learning framework for sequence-based protein crystallization prediction.
Bioinformatics. 2019 Jul 1;35(13):2216-2225. doi: 10.1093/bioinformatics/bty953.
9. RFCRYS: sequence-based protein crystallization propensity prediction by means of random forest.
J Theor Biol. 2012 Aug 7;306:115-9. doi: 10.1016/j.jtbi.2012.04.028. Epub 2012 May 2.
10. GCmapCrys: Integrating graph attention network with predicted contact map for multi-stage protein crystallization propensity prediction.
Anal Biochem. 2023 Feb 15;663:115020. doi: 10.1016/j.ab.2022.115020. Epub 2022 Dec 12.

Cited by

1. Increased preference for lysine over arginine in spike proteins of SARS-CoV-2 BA.2.86 variant and its daughter lineages.
PLoS One. 2025 Apr 7;20(4):e0320891. doi: 10.1371/journal.pone.0320891. eCollection 2025.
2. PLMC: Language Model of Protein Sequences Enhances Protein Crystallization Prediction.
Interdiscip Sci. 2024 Dec;16(4):802-813. doi: 10.1007/s12539-024-00639-6. Epub 2024 Aug 19.
3. Sequence-Based Prediction of Transmembrane Protein Crystallization Propensity.
Interdiscip Sci. 2021 Dec;13(4):693-702. doi: 10.1007/s12539-021-00448-1. Epub 2021 Jun 18.
4. Protein X-ray Crystallography and Drug Discovery.
Molecules. 2020 Feb 25;25(5):1030. doi: 10.3390/molecules25051030.
5. The Dundee Resource for Sequence Analysis and Structure Prediction.
Protein Sci. 2020 Jan;29(1):277-297. doi: 10.1002/pro.3783. Epub 2019 Nov 28.
6. BCrystal: an interpretable sequence-based protein crystallization predictor.
Bioinformatics. 2020 Mar 1;36(5):1429-1438. doi: 10.1093/bioinformatics/btz762.
