Program in Molecular Biophysics, The Johns Hopkins University, Baltimore, MD 21218, USA.
Department of Computer Science, George Mason University, Fairfax, VA 22030, USA.
Bioinformatics. 2020 Jul 1;36(Suppl_1):i268-i275. doi: 10.1093/bioinformatics/btaa457.
Antibody structure is largely conserved, except for a complementarity-determining region featuring six variable loops. Five of these loops adopt canonical folds which can typically be predicted with existing methods, while the remaining loop (CDR H3) remains a challenge due to its highly diverse set of observed conformations. In recent years, deep neural networks have proven to be effective at capturing the complex patterns of protein structure. This work proposes DeepH3, a deep residual neural network that learns to predict inter-residue distances and orientations from antibody heavy and light chain sequence. The output of DeepH3 is a set of probability distributions over distances and orientation angles between pairs of residues. These distributions are converted to geometric potentials and used to discriminate between decoy structures produced by RosettaAntibody and predict new CDR H3 loop structures de novo.
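The conversion from predicted distributions to potentials, and their use for ranking decoys, can be illustrated with a minimal sketch. The abstract does not give the exact functional form, so this sketch assumes a plain negative log-probability per bin with equal-width bins. The function names, the bin range (4 to 16 Å) and the toy numbers are all hypothetical and are not taken from the DeepH3 code base.

```python
import math

def distribution_to_potential(probs, eps=1e-6):
    """Turn one binned probability distribution into per-bin energies.

    Assumption: energy = -ln(p + eps); the actual DeepH3 form may differ.
    """
    return [-math.log(p + eps) for p in probs]

def bin_index(value, lower, upper, n_bins):
    """Index of the equal-width bin containing value, clamped to the valid range."""
    width = (upper - lower) / n_bins
    idx = int((value - lower) // width)
    return max(0, min(n_bins - 1, idx))

def score_decoy(decoy_distances, potentials, lower, upper):
    """Sum the bin energies over all residue pairs in a decoy (lower is better)."""
    total = 0.0
    for pair, distance in decoy_distances.items():
        energies = potentials[pair]
        total += energies[bin_index(distance, lower, upper, len(energies))]
    return total

# Toy predicted distributions for two residue pairs, three distance bins each
# (made-up values, for illustration only).
predicted = {(0, 1): [0.1, 0.7, 0.2], (0, 2): [0.05, 0.15, 0.8]}
potentials = {pair: distribution_to_potential(p) for pair, p in predicted.items()}

# Two hypothetical decoys described by their inter-residue distances in angstroms.
decoys = {
    "decoy_a": {(0, 1): 9.0, (0, 2): 13.0},  # falls in the most probable bins
    "decoy_b": {(0, 1): 5.0, (0, 2): 6.0},   # falls in low-probability bins
}
scores = {name: score_decoy(d, potentials, 4.0, 16.0) for name, d in decoys.items()}
best = min(scores, key=scores.get)
print(best, scores)
```

Orientation angles could be scored the same way by binning each angle over its own range; the sketch uses distances only to stay short.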
When evaluated on the Rosetta antibody benchmark dataset of 49 targets, DeepH3-predicted potentials identified better, same and worse structures [measured by root-mean-squared distance (RMSD) from the experimental CDR H3 loop structure] than the standard Rosetta energy function for 33, 6 and 10 targets, respectively, and improved the average RMSD of predictions by 32.1% (1.4 Å). Analysis of individual geometric potentials revealed that inter-residue orientations were more effective than inter-residue distances for discriminating near-native CDR H3 loops. When applied to de novo prediction of CDR H3 loop structures, DeepH3 achieves an average RMSD of 2.2 ± 1.1 Å on the Rosetta antibody benchmark.
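The better/same/worse comparison above rests on per-target RMSD values. The following sketch shows one way such a tally could be computed. It assumes the loop coordinates are already superimposed on the experimental structure, and it treats two RMSD values as "same" when they agree within a small tolerance; the abstract does not state how ties were defined, so that tolerance, the function names and the toy numbers are assumptions.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-squared distance between two equal-length lists of (x, y, z) points.

    Assumes the two structures are already superimposed; no alignment is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))

def tally(rmsd_new, rmsd_ref, tol=1e-9):
    """Count targets where the new method's RMSD is better, same or worse."""
    counts = {"better": 0, "same": 0, "worse": 0}
    for target, value in rmsd_new.items():
        diff = value - rmsd_ref[target]
        if abs(diff) <= tol:
            counts["same"] += 1
        elif diff < 0:
            counts["better"] += 1
        else:
            counts["worse"] += 1
    return counts

# Toy coordinates: every atom is displaced by 1 angstrom along z.
native = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
example_rmsd = rmsd(native, model)

# Made-up per-target RMSD values (angstroms) for two scoring methods.
new_scores = {"t1": 1.0, "t2": 2.0, "t3": 3.0}
ref_scores = {"t1": 2.0, "t2": 2.0, "t3": 2.5}
counts = tally(new_scores, ref_scores)
print(example_rmsd, counts)
```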
DeepH3 source code and pre-trained model parameters are freely available at https://github.com/Graylab/deepH3-distances-orientations.
Supplementary data are available at Bioinformatics online.