Messemaker Marius, Kwee Bjørn P Y, Moravec Živa, Álvarez-Salmoral Daniel, Urbanus Jos, de Paauw Sam, Geerligs Jeroen, Voogd Rhianne, Morris Ben, Guislain Aurélie, Mußmann Maike, Winkler Yaël, Steinmetz Maxime, Iras Matyas, Marcus Eric, Teuwen Jonas, Perrakis Anastassis, Beijersbergen Roderick L, Scheper Wouter, Schumacher Ton N
Division of Molecular Oncology & Immunology, The Netherlands Cancer Institute, Amsterdam, The Netherlands.
Oncode Institute, Utrecht, The Netherlands.
bioRxiv. 2025 May 12:2025.04.28.651095. doi: 10.1101/2025.04.28.651095.
Accurate prediction of TCR specificity is a holy grail of immunology, and large language models and computational structure prediction provide a path toward achieving it. Importantly, current TCR-pMHC prediction models have been trained and evaluated on historical data of unknown quality. Here, we develop and use a high-throughput synthetic platform for TCR assembly and evaluation to assess a large fraction of VDJdb-deposited TCR-pMHC entries with a standardized readout of TCR function. Strikingly, this analysis shows that the claimed TCR reactivity is confirmed for only 50% of evaluated entries. Intriguingly, using TCRbridge to analyze AlphaFold3 confidence metrics, we find that these metrics distinguish functionally validating from non-validating TCRs with substantial performance, even though AlphaFold3 was not trained on this task, demonstrating the utility of the validated VDJdb (TCRvdb) database that we generated. We provide TCRvdb as a resource to the community to support the training and evaluation of improved models for predicting TCR specificity.