Department of Chemistry, Vanderbilt University, Nashville, Tennessee, United States.
Center of Structural Biology, Vanderbilt University, Nashville, Tennessee, United States.
MAbs. 2022 Jan-Dec;14(1):2068212. doi: 10.1080/19420862.2022.2068212.
The human adaptive immune response enables the targeting of epitopes on pathogens with high specificity. Infection with a pathogen induces somatic hypermutation and B-cell selection processes that govern the shape and diversity of the antibody sequence landscape. To date, even the largest immunome repertoires of adaptive immune receptors acquired by next-generation sequencing cannot fully capture the vast antibody sequence space of a single individual, which is estimated to be at least 10 potential sequences. Degeneracy of the genetic code means that the number of possible nucleotide triplets (64) is greater than the number of canonical amino acids (20), so some amino acids are encoded by multiple triplets and different amino acids share the same nucleotide in 1 or 2 positions of the triplet. We hypothesize that the degeneracy of the genetic code can be used to statistically model an enlarged space of human antibody amino acid sequences, accounting for the discrepancy between the observed and the hypothesized antibody sequence space. Facilitated by Bayesian statistics and immunome repertoire clustering, we calculated amino acid probabilities from single nucleotide frequencies to infer a human amino acid sequence space that is used to design human-like antibodies with Rosetta. We show that antibodies designed with our restraints are on average up to 16.6% more human-like in the V and J regions compared to Rosetta designs produced without constraints. The human-likeness of the heavy-chain CDR3 region (CDRH3) could be increased for 8 of 27 antibodies compared to Rosetta designs with a similar number of mutations, and the approach could be successfully applied to antibodies to demonstrate humanization.
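The idea of deriving amino acid probabilities from single nucleotide frequencies can be illustrated with a minimal Python sketch. It is not the authors' Bayesian procedure, which the abstract does not specify. It assumes, for illustration only, that the three nucleotide positions of a codon are independent, so a codon's probability is the product of its per-position nucleotide frequencies and an amino acid's probability is the sum over its synonymous codons. The function name `amino_acid_probabilities` and the example frequencies are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def amino_acid_probabilities(position_freqs):
    """Turn per-position nucleotide frequencies into amino acid probabilities.

    position_freqs: list of three dicts mapping a base (T, C, A, G) to its
    observed frequency at codon positions 1, 2 and 3.
    Assumption (illustrative only): the three positions are independent.
    """
    probs = {}
    for codon, aa in CODON_TABLE.items():
        if aa == "*":  # stop codons are excluded from the sequence space
            continue
        p = 1.0
        for base, freqs in zip(codon, position_freqs):
            p *= freqs.get(base, 0.0)
        probs[aa] = probs.get(aa, 0.0) + p
    total = sum(probs.values())
    if total == 0:
        return {}
    # Renormalize after dropping stop codons and keep only reachable residues.
    return {aa: p / total for aa, p in probs.items() if p > 0}


# Hypothetical frequencies for one codon of a clustered repertoire:
# mostly GAT/GAC (Asp), with a minority G at the second position.
example = [
    {"G": 1.0},
    {"A": 0.9, "G": 0.1},
    {"T": 0.5, "C": 0.5},
]
probs = amino_acid_probabilities(example)
```

In this toy case the result assigns probability to glycine (GGT, GGC) as well as aspartate (GAT, GAC), which shows how single-nucleotide statistics can enlarge the amino acid space beyond the residues most often observed at a position.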