Application of Machine Learning in the Quantitative Analysis of the Surface Characteristics of Highly Abundant Cytoplasmic Proteins: Toward AI-Based Biomimetics.

Authors

Moon Jooa, Hu Guanghao, Hayashi Tomohiro

Affiliations

Department of Materials Science and Engineering, School of Materials and Chemical Technology, Tokyo Institute of Technology, Yokohama 226-8502, Japan.

The Institute for Solid State Physics, The University of Tokyo, Kashiwa 277-0882, Japan.

Publication information

Biomimetics (Basel). 2024 Mar 6;9(3):162. doi: 10.3390/biomimetics9030162.

DOI: 10.3390/biomimetics9030162
PMID: 38534847
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10967800/

Abstract

Proteins in the crowded environment of human cells have often been studied with respect to nonspecific interactions, misfolding, and aggregation, which may cause cellular malfunction and disease. Highly abundant proteins are more susceptible to these issues because of the law of mass action. The surfaces of highly abundant cytoplasmic (HAC) proteins, which are directly exposed to the environment, may therefore exhibit specific physicochemical, structural, and geometrical characteristics that reduce nonspecific interactions and adapt to the environment. However, the quantitative relationships between the overall surface descriptors still need clarification. Here, we used machine learning to identify HAC proteins using hydrophobicity, charge, roughness, secondary structures, and B-factor from the protein surfaces and quantified the contribution of each descriptor. First, several supervised learning algorithms were compared on the binary classification of HAC and extracellular protein surfaces. Then, logistic regression was selected for the feature importance analysis of the descriptors on the basis of its model performance (80.2% accuracy and 87.6% AUC) and interpretability. The HAC proteins showed positive correlations with negatively and positively charged areas but negative correlations with hydrophobicity, the B-factor, the proportion of beta structures, roughness, and the proportion of disordered regions. Finally, each descriptor could be explained in terms of the adaptive surface strategies that HAC proteins use to regulate nonspecific interactions, protein folding, flexibility, stability, and adsorption. This study presented a novel approach using various surface descriptors to identify HAC proteins and provided quantitative design rules for surfaces well suited to crowded human cellular environments.
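
The feature importance step described in the abstract can be illustrated with a minimal sketch: fit a logistic regression on standardized descriptors and read the sign and magnitude of each coefficient as that descriptor's contribution. This is not the authors' pipeline. The descriptor names, the synthetic data, the "true" weights, and the plain gradient-descent fit below are all assumptions chosen for illustration only.

```python
import math
import random

# Hypothetical descriptor names and made-up generating weights (not from the paper).
FEATURES = ["charged_area", "hydrophobicity", "roughness"]
TRUE_W = [1.5, -2.0, -0.5]

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def standardize(rows):
    # Scale each column to zero mean and unit variance so coefficients are comparable.
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]

def predict_proba(w, b, x):
    # Probability that a sample belongs to class 1 (here: a synthetic "HAC" label).
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def fit_logistic(X, y, lr=0.1, epochs=500):
    # Batch gradient descent on the mean log-loss.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

# Synthetic binary classification data: 1 = "HAC", 0 = "extracellular".
rng = random.Random(0)
X_raw, y = [], []
for _ in range(400):
    x = [rng.gauss(0, 1) for _ in FEATURES]
    p = sigmoid(sum(t * v for t, v in zip(TRUE_W, x)))
    X_raw.append(x)
    y.append(1 if rng.random() < p else 0)

X = standardize(X_raw)
w, b = fit_logistic(X, y)

# Training accuracy at a 0.5 decision threshold.
accuracy = sum((predict_proba(w, b, xi) >= 0.5) == (yi == 1)
               for xi, yi in zip(X, y)) / len(y)

# Rank descriptors by absolute coefficient; the sign gives the direction of the association.
ranking = sorted(zip(FEATURES, w), key=lambda t: abs(t[1]), reverse=True)
for name, coef in ranking:
    print(f"{name:15s} {coef:+.3f}")
print(f"training accuracy: {accuracy:.3f}")
```

Because the descriptors are standardized before fitting, a positive coefficient means the descriptor raises the odds of the positive class and a negative one lowers them, which is the sense in which the abstract reports positive and negative correlations. A real analysis would evaluate on held-out data rather than on the training set.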
