Hadsund Johannes Thorling, Satława Tadeusz, Janusz Bartosz, Shan Lu, Zhou Li, Röttger Richard, Krawczyk Konrad
Department of Mathematics and Computer Science, University of Southern Denmark, Odense, 5230, Denmark.
NaturalAntibody, Szczecin, 71-899, Poland.
Bioinform Adv. 2024 Mar 6;4(1):vbae033. doi: 10.1093/bioadv/vbae033. eCollection 2024.
Nanobodies are a subclass of immunoglobulins whose binding site consists of only one peptide chain, bestowing favorable biophysical properties. Recently, the first nanobody therapy was approved, paving the way for further clinical applications of this antibody format. Further development of nanobody-based therapeutics could be streamlined by computational methods. One such method is infilling, the positional prediction of biologically feasible mutations in nanobodies. Being able to identify possible positional substitutions based on sequence context facilitates the functional design of such molecules.
Here we present nanoBERT, a nanobody-specific transformer that predicts the amino acid at a given position in a query sequence. We demonstrate the need for such a machine-learning-based protocol, as opposed to gene-specific positional statistics, since an appropriate genetic reference is not available. We benchmark nanoBERT against human-based language models and ESM-2, demonstrating the benefit of domain-specific language models. We also demonstrate the benefit of employing nanobody-specific predictions for fine-tuning on an experimentally measured thermostability dataset. We hope that nanoBERT will help engineers in a range of predictive tasks for designing therapeutic nanobodies.
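To make the infilling task concrete, the following is a minimal sketch of the positional-statistics baseline that the abstract contrasts with nanoBERT. It is not the nanoBERT model or the authors' code. The function names, the `*` mask symbol, and the toy aligned sequences are all assumptions made up for illustration, and a real baseline would need a numbered alignment against a genetic reference, which the abstract notes is not available for nanobodies.

```python
from collections import Counter

MASK = "*"  # hypothetical mask symbol marking the position to infill


def positional_frequencies(aligned):
    """Count residues per column of equal-length, pre-aligned sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all reference sequences must have the same length")
    # zip(*aligned) iterates over alignment columns
    return [Counter(column) for column in zip(*aligned)]


def infill(query, freqs, mask=MASK):
    """Replace each masked position with the most frequent reference residue."""
    if len(query) != len(freqs):
        raise ValueError("query length must match the reference alignment")
    filled = []
    for residue, counts in zip(query, freqs):
        if residue == mask:
            residue = counts.most_common(1)[0][0]
        filled.append(residue)
    return "".join(filled)


# Toy reference set (invented fragments, not real nanobody data)
reference = ["QVQLVE", "QVQLQE", "EVQLVE", "QVKLVE"]
freqs = positional_frequencies(reference)
print(infill("QV*LVE", freqs))
```

A masked language model such as nanoBERT addresses the same task but conditions the prediction on the whole sequence context rather than on a single alignment column.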