Shrestha Palistha, Talwar Chandana S, Kandel Jeevan, Park Kwang-Hyun, Chong Kil To, Woo Eui-Jeon, Tayara Hilal
Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju, 54896, Jeollabuk-do, Republic of Korea.
Disease Target Structure Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), 125 Gwahak-ro, Yuseong-gu, Daejeon, 34141, Republic of Korea.
J Cheminform. 2025 Jun 16;17(1):96. doi: 10.1186/s13321-025-01040-1.
Nanobodies offer significant therapeutic potential because of their small size, stability, and versatility. Although advances in computational protein design have made de novo nanobody design increasingly feasible, few tools are tailored specifically to this purpose. Rosetta, with its specialized protocols, is a prominent tool for nanobody design but is limited by a high false-negative rate, which necessitates extensive high-throughput screening. The large-scale experimentation and detailed structural analysis this requires increase cost, time, and labor. To address these challenges, we introduce NanoBinder, an interpretable machine learning model that predicts nanobody-antigen binding from Rosetta energy scores. NanoBinder uses a Random Forest model trained on experimentally validated complexes and can be integrated into the Rosetta software. It employs SHAP summary plots for interpretability, which help identify the key features influencing binding interactions. Experimentally validated on forty-nine diverse nanobodies, NanoBinder accurately predicts non-binders and shows reasonable performance in identifying binders. This approach significantly enhances predictive accuracy, reduces the need for extensive experimental assays, and accelerates nanobody development, offering a way to reduce the cost, time, and labor associated with high-throughput screening.

Scientific contribution: This study introduces NanoBinder, a machine learning framework for predicting nanobody-antigen binding using Rosetta-derived energy features. Through rigorous experimental validation across diverse nanobody sets, NanoBinder enhances nanobody screening workflows by reducing false positives and minimizing reliance on extensive wet-lab assays. The approach bridges the gap between physics-based modeling and data-driven prediction in nanobody design.
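The general idea of classifying binders from energy scores with a Random Forest can be illustrated with a minimal sketch. This is not the NanoBinder implementation: the feature names below are borrowed from common Rosetta interface score terms purely for illustration, the data are synthetic, and scikit-learn's impurity-based feature importances are used as a simpler stand-in for the SHAP analysis described above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Illustrative feature names only (assumption): the abstract does not list
# which Rosetta energy terms NanoBinder actually uses.
FEATURES = ["dG_separated", "dSASA_int", "hbonds_int", "packstat"]


def make_synthetic_scores(n_samples=400, seed=0):
    """Generate synthetic, standardized 'energy scores' with binder labels.

    Assumption for the toy data: binders tend to have lower dG_separated
    and larger buried surface area; the other two columns are pure noise.
    """
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, len(FEATURES)))
    signal = -1.5 * X[:, 0] + 1.0 * X[:, 1]
    signal += rng.normal(scale=0.5, size=n_samples)
    y = (signal > 0).astype(int)  # 1 = binder, 0 = non-binder
    return X, y


X, y = make_synthetic_scores()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit the forest on the training split.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Held-out accuracy and per-complex binding probabilities.
accuracy = model.score(X_test, y_test)
binding_prob = model.predict_proba(X_test)[:, 1]

# Impurity-based importances (stand-in for SHAP values in this sketch).
importances = dict(zip(FEATURES, model.feature_importances_))

print(f"held-out accuracy: {accuracy:.2f}")
for name, value in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name:>14s}  {value:.3f}")
```

In a real workflow the synthetic matrix would be replaced by score terms parsed from Rosetta output for each nanobody-antigen complex, and the predicted probabilities could be used to rank designs before experimental testing.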