
Insight into the protein solubility driving forces with neural attention.

Affiliations

ESAT-STADIUS, KU Leuven, Leuven, Belgium.

SWITCH Lab, KU Leuven, Leuven, Belgium.

Publication information

PLoS Comput Biol. 2020 Apr 30;16(4):e1007722. doi: 10.1371/journal.pcbi.1007722. eCollection 2020 Apr.

DOI:10.1371/journal.pcbi.1007722
PMID:32352965
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7217484/
Abstract

Protein solubility is a key aspect for many biotechnological, biomedical and industrial processes, such as the production of active proteins and antibodies. In addition, understanding the molecular determinants of the solubility of proteins may be crucial to shed light on the molecular mechanisms of diseases caused by aggregation processes such as amyloidosis. Here we present SKADE, a novel Neural Network protein solubility predictor and we show how it can provide novel insight into the protein solubility mechanisms, thanks to its neural attention architecture. First, we show that SKADE positively compares with state of the art tools while using just the protein sequence as input. Then, thanks to the neural attention mechanism, we use SKADE to investigate the patterns learned during training and we analyse its decision process. We use this peculiarity to show that, while the attention profiles do not correlate with obvious sequence aspects such as biophysical properties of the aminoacids, they suggest that N- and C-termini are the most relevant regions for solubility prediction and are predictive for complex emergent properties such as aggregation-prone regions involved in beta-amyloidosis and contact density. Moreover, SKADE is able to identify mutations that increase or decrease the overall solubility of the protein, allowing it to be used to perform large scale in-silico mutagenesis of proteins in order to maximize their solubility.
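
The two ideas in the abstract, reading a per-residue attention profile and scanning point mutations for solubility gains, can be illustrated with a toy sketch. This is not SKADE: the weights are random stand-ins for trained parameters, the attention logit depends only on residue identity (no sequence context), and every name here (`make_toy_model`, `attention_profile`, `predict`, `mutagenesis_scan`) is hypothetical.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def make_toy_model(seed=0):
    """Random per-residue parameters; an assumed stand-in for a trained network."""
    rng = random.Random(seed)
    attn = {aa: rng.uniform(-1.0, 1.0) for aa in AMINO_ACIDS}   # attention logits
    value = {aa: rng.uniform(-1.0, 1.0) for aa in AMINO_ACIDS}  # residue contributions
    return attn, value


def attention_profile(seq, attn):
    """Softmax over positions: one non-negative weight per residue, summing to 1."""
    logits = [attn[aa] for aa in seq]
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def predict(seq, model):
    """Attention-weighted sum of residue contributions, squashed to (0, 1)."""
    attn, value = model
    weights = attention_profile(seq, attn)
    pooled = sum(w * value[aa] for w, aa in zip(weights, seq))
    return 1.0 / (1.0 + math.exp(-pooled))


def mutagenesis_scan(seq, model):
    """Score every single-point substitution.

    Returns (score change, 1-based position, wild type, mutant) tuples,
    sorted so the largest predicted increase comes first.
    """
    base = predict(seq, model)
    results = []
    for i, wild_type in enumerate(seq):
        for mutant_aa in AMINO_ACIDS:
            if mutant_aa == wild_type:
                continue
            mutant = seq[:i] + mutant_aa + seq[i + 1:]
            results.append((predict(mutant, model) - base, i + 1, wild_type, mutant_aa))
    results.sort(reverse=True)
    return results
```

Calling `mutagenesis_scan("MKTAYIAKQR", make_toy_model())[:5]` lists the five substitutions this toy model scores as most beneficial. With a trained predictor in place of the random weights, the same loop is the kind of large-scale in-silico mutagenesis the abstract describes, and the output of `attention_profile` is the per-position signal one would inspect for emphasis on the N- and C-termini.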


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/119b/7217484/6ad25f8db0cc/pcbi.1007722.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/119b/7217484/db21f4714773/pcbi.1007722.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/119b/7217484/7aeeffad51e9/pcbi.1007722.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/119b/7217484/485d37a506d5/pcbi.1007722.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/119b/7217484/6805f794802c/pcbi.1007722.g005.jpg

Similar articles

1
Insight into the protein solubility driving forces with neural attention.
PLoS Comput Biol. 2020 Apr 30;16(4):e1007722. doi: 10.1371/journal.pcbi.1007722. eCollection 2020 Apr.
2
AGGRESCAN3D: Toward the Prediction of the Aggregation Propensities of Protein Structures.
Methods Mol Biol. 2018;1762:427-443. doi: 10.1007/978-1-4939-7756-7_21.
3
GATSol, an enhanced predictor of protein solubility through the synergy of 3D structure graph and large language modeling.
BMC Bioinformatics. 2024 Jun 1;25(1):204. doi: 10.1186/s12859-024-05820-8.
4
DeepSol: a deep learning framework for sequence-based protein solubility prediction.
Bioinformatics. 2018 Aug 1;34(15):2605-2613. doi: 10.1093/bioinformatics/bty166.
5
Solubis: a webserver to reduce protein aggregation through mutation.
Protein Eng Des Sel. 2016 Aug;29(8):285-9. doi: 10.1093/protein/gzw019. Epub 2016 Jun 9.
6
PaRSnIP: sequence-based protein solubility prediction using gradient boosting machine.
Bioinformatics. 2018 Apr 1;34(7):1092-1098. doi: 10.1093/bioinformatics/btx662.
7
Computational analysis of the amino acid interactions that promote or decrease protein solubility.
Sci Rep. 2018 Oct 2;8(1):14661. doi: 10.1038/s41598-018-32988-w.
8
AGGRESCAN: method, application, and perspectives for drug design.
Methods Mol Biol. 2012;819:199-220. doi: 10.1007/978-1-61779-465-0_14.
9
AggreProt: a web server for predicting and engineering aggregation prone regions in proteins.
Nucleic Acids Res. 2024 Jul 5;52(W1):W159-W169. doi: 10.1093/nar/gkae420.
10
RONN: the bio-basis function neural network technique applied to the detection of natively disordered regions in proteins.
Bioinformatics. 2005 Aug 15;21(16):3369-76. doi: 10.1093/bioinformatics/bti534. Epub 2005 Jun 9.

Cited by

1
SOuLMuSiC, a novel tool for predicting the impact of mutations on protein solubility.
Sci Rep. 2025 Jul 29;15(1):27531. doi: 10.1038/s41598-025-11326-x.
2
FINCHES: A Computational Framework for Predicting Intermolecular Interactions in Intrinsically Disordered Proteins.
Int J Mol Sci. 2025 Jun 28;26(13):6246. doi: 10.3390/ijms26136246.
3
GRACE: Generative Redesign in Artificial Computational Enzymology.
ACS Synth Biol. 2024 Dec 20;13(12):4154-4164. doi: 10.1021/acssynbio.4c00624. Epub 2024 Nov 8.
4
PLM_Sol: predicting protein solubility by benchmarking multiple protein language models with the updated Escherichia coli protein solubility dataset.
Brief Bioinform. 2024 Jul 25;25(5). doi: 10.1093/bib/bbae404.
5
Stability of Protein Pharmaceuticals: Recent Advances.
Pharm Res. 2024 Jul;41(7):1301-1367. doi: 10.1007/s11095-024-03726-x. Epub 2024 Jun 27.
6
Advanced computational approaches to understand protein aggregation.
Biophys Rev (Melville). 2024 Apr 24;5(2):021302. doi: 10.1063/5.0180691. eCollection 2024 Jun.
7
DOTAD: A Database of Therapeutic Antibody Developability.
Interdiscip Sci. 2024 Sep;16(3):623-634. doi: 10.1007/s12539-024-00613-2. Epub 2024 Mar 26.
8
Accelerating therapeutic protein design with computational approaches toward the clinical stage.
Comput Struct Biotechnol J. 2023 Apr 29;21:2909-2926. doi: 10.1016/j.csbj.2023.04.027. eCollection 2023.
9
CysPresso: a classification model utilizing deep learning protein representations to predict recombinant expression of cysteine-dense peptides.
BMC Bioinformatics. 2023 May 16;24(1):200. doi: 10.1186/s12859-023-05327-8.
10
A Novel Strategy to Identify Endolysins with Lytic Activity against Methicillin-Resistant .
Int J Mol Sci. 2023 Mar 17;24(6):5772. doi: 10.3390/ijms24065772.

References

1
Exploring the limitations of biophysical propensity scales coupled with machine learning for protein sequence analysis.
Sci Rep. 2019 Nov 15;9(1):16932. doi: 10.1038/s41598-019-53324-w.
2
Ultra-fast global homology detection with Discrete Cosine Transform and Dynamic Time Warping.
Bioinformatics. 2018 Sep 15;34(18):3118-3125. doi: 10.1093/bioinformatics/bty309.
3
DeepSol: a deep learning framework for sequence-based protein solubility prediction.
Bioinformatics. 2018 Aug 1;34(15):2605-2613. doi: 10.1093/bioinformatics/bty166.
4
PaRSnIP: sequence-based protein solubility prediction using gradient boosting machine.
Bioinformatics. 2018 Apr 1;34(7):1092-1098. doi: 10.1093/bioinformatics/btx662.
5
Exploring the Sequence-based Prediction of Folding Initiation Sites in Proteins.
Sci Rep. 2017 Aug 18;7(1):8826. doi: 10.1038/s41598-017-08366-3.
6
Protein homeostasis of a metastable subproteome associated with Alzheimer's disease.
Proc Natl Acad Sci U S A. 2017 Jul 11;114(28):E5703-E5711. doi: 10.1073/pnas.1618417114. Epub 2017 Jun 26.
7
SODA: prediction of protein solubility from disorder and aggregation propensity.
Nucleic Acids Res. 2017 Jul 3;45(W1):W236-W240. doi: 10.1093/nar/gkx412.
8
Protein Misfolding, Amyloid Formation, and Human Disease: A Summary of Progress Over the Last Decade.
Annu Rev Biochem. 2017 Jun 20;86:27-68. doi: 10.1146/annurev-biochem-061516-045115. Epub 2017 May 12.
9
Observation selection bias in contact prediction and its implications for structural bioinformatics.
Sci Rep. 2016 Nov 18;6:36679. doi: 10.1038/srep36679.
10
The CamSol method of rational design of protein mutants with enhanced solubility.
J Mol Biol. 2015 Jan 30;427(2):478-90. doi: 10.1016/j.jmb.2014.09.026. Epub 2014 Oct 14.