Andreani Tommaso, Slot Linda M, Gabillard Samuel, Strübing Carsten, Reimertz Claus, Yaligara Veeranagouda, Bakker Aleida M, Olfati-Saber Reza, Toes René E M, Scherer Hans U, Augé Franck, Šimaitė Deimantė
AI & Deep Analytics-Omics Data Science, Sanofi, Frankfurt am Main 65926, Germany.
Department of Rheumatology, Leiden University Medical Center, 2333 RC Leiden, The Netherlands.
NAR Genom Bioinform. 2022 Jul 13;4(3):lqac049. doi: 10.1093/nargab/lqac049. eCollection 2022 Sep.
Multiple methods have recently been developed to reconstruct full-length B-cell receptors (BCRs) from single-cell RNA sequencing (scRNA-seq) data. This need emerged from the expansion of scRNA-seq techniques, the increasing interest in antibody-based drug development and the importance of BCR repertoire changes in cancer and autoimmune disease progression. However, a comprehensive assessment of performance-influencing factors such as sequencing depth, read length and number of somatic hypermutations (SHMs), as well as guidance on the choice of method, is still lacking. In this work, we evaluated the ability of six available methods to reconstruct full-length BCRs using one simulated and three experimental SMART-seq datasets. In addition, we validated that the assembled BCRs recognize their intended targets when expressed as monoclonal antibodies. We observed that methods such as BALDR, BASIC and BRACER showed the best overall performance across the tested datasets and conditions, whereas only BASIC produced acceptable results on very short-read libraries. Furthermore, the assembly-based methods BRACER and BALDR were the most accurate in reconstructing BCRs harboring different degrees of SHMs in the variable domain, while TRUST4, MiXCR and BASIC were the fastest. Finally, we propose guidelines for selecting the best method based on the characteristics of the data.