Tubiana Jérôme, Schneidman-Duhovny Dina, Wolfson Haim J
Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, Israel.
School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem, Israel.
Nat Methods. 2022 Jun;19(6):730-739. doi: 10.1038/s41592-022-01490-7. Epub 2022 May 30.
Predicting the functional sites of a protein from its structure, such as the binding sites of small molecules, other proteins or antibodies, sheds light on its function in vivo. Currently, two classes of methods prevail: machine learning models built on top of handcrafted features and comparative modeling. They are, respectively, limited by the expressivity of the handcrafted features and the availability of similar proteins. Here, we introduce ScanNet, an end-to-end, interpretable geometric deep learning model that learns features directly from 3D structures. ScanNet builds representations of atoms and amino acids based on the spatio-chemical arrangement of their neighbors. We train ScanNet for detecting protein-protein and protein-antibody binding sites, demonstrate its accuracy (including for unseen protein folds) and interpret the filters learned. Finally, we predict epitopes of the SARS-CoV-2 spike protein, validating known antigenic regions and predicting previously uncharacterized ones. Overall, ScanNet is a versatile, powerful and interpretable model suitable for functional site prediction tasks. A webserver for ScanNet is available from http://bioinfo3d.cs.tau.ac.il/ScanNet/.