Department of Physics, University of Washington, Seattle, WA 98195.
The Department for Statistical Physics of Evolving Systems, Max Planck Institute for Dynamics and Self-Organization, Göttingen 37077, Germany.
Proc Natl Acad Sci U S A. 2024 Feb 6;121(6):e2300838121. doi: 10.1073/pnas.2300838121. Epub 2024 Feb 1.
Proteins play a central role in biology, from immune recognition to brain activity. While major advances in machine learning have improved our ability to predict protein structure from sequence, determining a protein's function from its sequence or structure remains a major challenge. Here, we introduce the holographic convolutional neural network (H-CNN) for proteins, a physically motivated machine learning approach to modeling amino acid preferences in protein structures. H-CNN reflects physical interactions in a protein structure and recapitulates the functional information stored in evolutionary data. H-CNN accurately predicts the impact of mutations on protein stability and on the binding of protein complexes. Our interpretable computational model for protein structure-function maps could guide the design of novel proteins with desired function.