Solvent Accessibility of Residues Undergoing Pathogenic Variations in Humans: From Protein Structures to Protein Sequences.

Authors

Savojardo Castrense, Manfredi Matteo, Martelli Pier Luigi, Casadio Rita

Affiliations

Biocomputing Group, Department of Pharmacy and Biotechnologies, University of Bologna, Bologna, Italy.

Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies of the National Research Council, Bari, Italy.

Publication information

Front Mol Biosci. 2021 Jan 7;7:626363. doi: 10.3389/fmolb.2020.626363. eCollection 2020.

DOI:10.3389/fmolb.2020.626363
PMID:33490109
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7817970/
Abstract

Solvent accessible surface area (SASA) is a key feature of proteins that helps determine their folding and stability. SASA is computed from protein structures with different algorithms, and predicted from protein sequences with machine-learning approaches trained on solved structures. Here we ask to what extent the solvent exposure of a residue is associated with the pathogenicity of its variation. In this way, the SASA of the wild-type residue acquires a role in the functional annotation of protein single-residue variations (SRVs). By mapping variations onto a curated database of human protein structures, we found that residues targeted by disease-related SRVs are less accessible to solvent than residues involved in polymorphisms. Disease association is not evenly distributed among residue types: SRVs targeting glycine, tryptophan, tyrosine, and cysteine are more frequently disease-associated than others. For all residue types, the proportion of disease-related SRVs increases markedly when the wild-type residue is buried and decreases when it is exposed. The extent of the increase depends on the residue type. With the aid of an in-house predictor, based on deep learning and performing at the state of the art, we confirm this tendency by analyzing a large data set of residues subject to variation in some 12,494 human protein sequences that still lack a three-dimensional structure (derived from HUMSAVAR). Our data support the notion that solvent accessible surface area is a distinguishing property of residues that undergo variation, and that pathogenicity is more frequently associated with buried residues than with exposed ones.
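The comparison described in the abstract (proportion of disease-related SRVs among buried versus exposed wild-type residues, per residue type) can be illustrated with a minimal Python sketch. Everything named here is an assumption for illustration: the `Variant` record, the `disease_fraction_by_burial` function, and the 0.2 relative-accessibility cutoff separating buried from exposed are not taken from the abstract, and the toy records are invented only to exercise the code, not drawn from the paper's data.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple, Tuple


class Variant(NamedTuple):
    """Hypothetical record for one single-residue variation (SRV)."""
    wild_type: str   # one-letter code of the wild-type residue
    rsa: float       # relative solvent accessibility of that residue, 0..1
    disease: bool    # True if disease-related, False if polymorphism


def disease_fraction_by_burial(
    variants: Iterable[Variant],
    threshold: float = 0.2,  # assumed buried/exposed cutoff, not from the abstract
) -> Dict[Tuple[str, str], float]:
    """Return the fraction of disease-related SRVs per (residue, state).

    The state is "buried" when rsa < threshold, otherwise "exposed".
    The pseudo-residue "all" pools every residue type together.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [disease count, total count]
    for v in variants:
        state = "buried" if v.rsa < threshold else "exposed"
        for key in ((v.wild_type, state), ("all", state)):
            counts[key][0] += int(v.disease)
            counts[key][1] += 1
    return {key: d / n for key, (d, n) in counts.items()}


# Toy records, invented for illustration only.
toy = [
    Variant("G", 0.05, True),
    Variant("G", 0.10, True),
    Variant("G", 0.60, False),
    Variant("K", 0.08, True),
    Variant("K", 0.70, False),
    Variant("K", 0.55, True),
]

fractions = disease_fraction_by_burial(toy)
for key in sorted(fractions):
    print(key, round(fractions[key], 3))
```

With real data, the wild-type accessibility would come either from a solved structure or from a sequence-based predictor, and the per-residue fractions for the two states could then be compared directly.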

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47d6/7817970/da1c3da61e92/fmolb-07-626363-g0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47d6/7817970/207086210192/fmolb-07-626363-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47d6/7817970/84fe0b493fbd/fmolb-07-626363-g0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47d6/7817970/54b784c24386/fmolb-07-626363-g0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47d6/7817970/e1379c09cb6e/fmolb-07-626363-g0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47d6/7817970/7c6757ab1275/fmolb-07-626363-g0006.jpg
