The Alexander Kofkin Faculty of Engineering, Bar Ilan University, 5290002, Ramat Gan, Israel.
Bar Ilan Institute of Nanotechnology and Advanced Materials, Bar Ilan University, 5290002, Ramat Gan, Israel.
Genome Res. 2023 Jan;33(1):71-79. doi: 10.1101/gr.276683.122. Epub 2022 Dec 16.
Crohn's disease (CD) is a chronic relapsing-remitting inflammatory disorder of the gastrointestinal tract that is characterized by altered innate and adaptive immune function. Although massively parallel sequencing studies of the T cell receptor repertoire have identified oligoclonal expansion of unique clones, much less is known about the B cell receptor (BCR) repertoire in CD. Here, we present a novel BCR repertoire sequencing data set from ileal biopsies from pediatric patients with CD and controls, and identify CD-specific somatic hypermutation (SHM) patterns, revealed by a machine learning (ML) algorithm trained on BCR repertoire sequences. Moreover, ML classification of a different data set, from blood samples of adults with CD versus controls, showed that V gene usage, clusters, or mutation frequencies yielded excellent results in classifying the disease (F1 > 90%). In summary, we show that an ML algorithm enables the classification of CD based on unique BCR repertoire features with high accuracy.
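For readers unfamiliar with the quantities named in the abstract, the sketch below illustrates two of them: a V gene usage vector for one sample, and the F1 score used to report classification quality. It is a minimal illustration and not the authors' pipeline. The function names `v_gene_usage` and `f1_score`, the gene list, and all input values are made up for the example and are not taken from the study.

```python
from collections import Counter


def v_gene_usage(v_calls, genes):
    """Fraction of sequences assigned to each V gene, in the order of `genes`."""
    counts = Counter(v_calls)
    total = sum(counts[g] for g in genes)
    return [counts[g] / total if total else 0.0 for g in genes]


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical inputs: V gene calls for one sample (illustrative only).
genes = ["IGHV1-69", "IGHV3-23", "IGHV4-34"]
usage = v_gene_usage(["IGHV3-23", "IGHV3-23", "IGHV1-69", "IGHV4-34"], genes)

# Hypothetical labels: 1 = CD, 0 = control (not data from the study).
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1]
score = f1_score(y_true, y_pred)
```

In a real analysis, one usage vector per sample would serve as the feature row given to a classifier, and F1 would be computed on held-out samples. Which classifier and validation scheme the study used is not stated in the abstract.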