Pavlović Milena, Scheffer Lonneke, Motwani Keshav, Kanduri Chakravarthi, Kompova Radmila, Vazov Nikolay, Waagan Knut, Bernal Fabian L M, Costa Alexandre Almeida, Corrie Brian, Akbar Rahmad, Al Hajj Ghadi S, Balaban Gabriel, Brusko Todd M, Chernigovskaya Maria, Christley Scott, Cowell Lindsay G, Frank Robert, Grytten Ivar, Gundersen Sveinung, Haff Ingrid Hobæk, Hovig Eivind, Hsieh Ping-Han, Klambauer Günter, Kuijjer Marieke L, Lund-Andersen Christin, Martini Antonio, Minotto Thomas, Pensar Johan, Rand Knut, Riccardi Enrico, Robert Philippe A, Rocha Artur, Slabodkin Andrei, Snapkov Igor, Sollid Ludvig M, Titov Dmytro, Weber Cédric R, Widrich Michael, Yaari Gur, Greiff Victor, Sandve Geir Kjetil
Department of Informatics, University of Oslo, Norway.
Centre for Bioinformatics, University of Oslo, Norway.
Nat Mach Intell. 2021 Nov;3(11):936-944. doi: 10.1038/s42256-021-00413-z. Epub 2021 Nov 16.
Adaptive immune receptor repertoires (AIRR) are key targets for biomedical research as they record past and ongoing adaptive immune responses. The capacity of machine learning (ML) to identify complex discriminative sequence patterns renders it an ideal approach for AIRR-based diagnostic and therapeutic discovery. To date, widespread adoption of AIRR ML has been inhibited by a lack of reproducibility, transparency, and interoperability. immuneML (immuneml.uio.no) addresses these concerns by implementing each step of the AIRR ML process in an extensible, open-source software ecosystem that is based on fully specified and shareable workflows. To facilitate widespread user adoption, immuneML is available as a command-line tool and through an intuitive Galaxy web interface, and extensive documentation of workflows is provided. We demonstrate the broad applicability of immuneML by (i) reproducing a large-scale study on immune state prediction, (ii) developing, integrating, and applying a novel deep learning method for antigen specificity prediction, and (iii) showcasing streamlined interpretability-focused benchmarking of AIRR ML.