NaturalAntibody, Szczecin, Poland.
Discovery Data Science, Genmab, Copenhagen, Denmark.
MAbs. 2024 Jan-Dec;16(1):2361928. doi: 10.1080/19420862.2024.2361928. Epub 2024 Jun 6.
The naïve human antibody repertoire has theoretical access to an estimated >10^15 antibodies. Identifying subsets of this prohibitively large space where therapeutically relevant antibodies may be found is useful for the development of these agents. It was previously demonstrated that, despite the immense sequence space, different individuals can produce the same antibodies. It was also shown that therapeutic antibodies, which typically follow seemingly unnatural development processes, can arise independently in nature. To check for biases in how the sequence space is explored, we mined public repositories and identified 220 bioprojects with a combined seven billion reads. Of these, we created a subset of human bioprojects that we make available as the AbNGS database (https://naturalantibody.com/ngs/). AbNGS contains 135 bioprojects with four billion productive human heavy variable region sequences and 385 million unique complementarity-determining region (CDR)-H3s. We find that 270,000 (0.07% of 385 million) unique CDR-H3s are highly public in that they occur in at least five of the 135 bioprojects. Of 700 unique therapeutic CDR-H3s, 6% have direct matches in this small set of 270,000. This observation also holds when a match requires both the CDR-H3 and the V-gene call to agree. Thus, the subspace of shared ('public') CDR-H3s shows utility as a starting point for therapeutic antibody design.
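The "highly public" criterion and the therapeutic match rate described above can be illustrated with a minimal sketch. It assumes sequences are available as (bioproject ID, CDR-H3) pairs and that a match means exact string identity. The function names, the toy sequences, and the project IDs are hypothetical and are not taken from AbNGS or from the authors' pipeline.

```python
from collections import defaultdict


def public_cdrh3s(records, min_projects=5):
    """Return CDR-H3s observed in at least `min_projects` distinct bioprojects.

    `records` is an iterable of (bioproject_id, cdrh3) pairs. Repeated
    observations within one bioproject count only once.
    """
    projects_by_cdr = defaultdict(set)
    for bioproject_id, cdrh3 in records:
        projects_by_cdr[cdrh3].add(bioproject_id)
    return {
        cdr for cdr, projects in projects_by_cdr.items()
        if len(projects) >= min_projects
    }


def matched_fraction(therapeutic_cdrh3s, public_set):
    """Fraction of unique therapeutic CDR-H3s with an exact match in `public_set`."""
    unique = set(therapeutic_cdrh3s)
    if not unique:
        return 0.0
    return len(unique & public_set) / len(unique)


# Toy data (invented for illustration): one CDR-H3 seen in five projects,
# one seen in only two, with a duplicate read inside project P1.
records = [
    ("P1", "ARDLDY"), ("P1", "ARDLDY"), ("P2", "ARDLDY"),
    ("P3", "ARDLDY"), ("P4", "ARDLDY"), ("P5", "ARDLDY"),
    ("P1", "ARGGYFDV"), ("P2", "ARGGYFDV"),
]
public = public_cdrh3s(records, min_projects=5)
fraction = matched_fraction(["ARDLDY", "ARWGGDGFYAMDY"], public)
```

Requiring agreement on the V-gene call as well would amount to keying on a (V gene, CDR-H3) tuple instead of the CDR-H3 string alone; the counting logic stays the same.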