Detection of disease-specific signatures in B cell repertoires of lymphomas using machine learning.

Affiliations

Department of Biomedicine, Translational Immuno-Oncology, University Hospital Basel, Basel, Switzerland.

Collaborative Research Institute Intelligent Oncology (CRIION), Freiburg, Germany.

Publication information

PLoS Comput Biol. 2024 Jul 2;20(7):e1011570. doi: 10.1371/journal.pcbi.1011570. eCollection 2024 Jul.

Abstract

The classification of B cell lymphomas, which relies mainly on light microscopy evaluation by a pathologist, requires many years of training. Since the B cell receptor (BCR) of the lymphoma clonotype and the microenvironmental immune architecture are important features discriminating different lymphoma subsets, we asked whether BCR repertoire next-generation sequencing (NGS) of lymphoma-infiltrated tissues in conjunction with machine learning algorithms could have diagnostic utility in the subclassification of these cancers. We trained a random forest and a linear classifier via logistic regression based on patterns of clonal distribution, VDJ gene usage and physico-chemical properties of the top-n most frequently represented clonotypes in the BCR repertoires of 620 paradigmatic lymphoma samples, comprising nodular lymphocyte predominant B cell lymphoma (NLPBL), diffuse large B cell lymphoma (DLBCL) and chronic lymphocytic leukemia (CLL), alongside 291 control samples. For DLBCL and CLL, the models performed best when only the most prevalent clonotype was used for classification, whereas in NLPBL, which has a dominant background of non-malignant bystander cells, a broader array of clonotypes enhanced model accuracy. Surprisingly, the straightforward logistic regression model performed best in this seemingly complex classification problem, suggesting linear separability in our chosen dimensions. It achieved a weighted F1-score of 0.84 on a test cohort including 125 samples from all three lymphoma entities and 58 samples from healthy individuals. Together, we provide proof-of-concept that at least the 3 studied lymphoma entities can be differentiated from each other using BCR repertoire NGS on lymphoma-infiltrated tissues by a trained machine learning model.
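The weighted F1-score reported above is, in its usual definition, the mean of the per-class F1 scores weighted by each class's share of the true labels. The abstract does not state how the authors computed it, so the following is only a minimal sketch of that standard definition. The function name `weighted_f1` and the toy labels are hypothetical and are not data from the study.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted mean of per-class F1 scores (standard definition)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    support = Counter(y_true)      # number of true samples per class
    predicted = Counter(y_pred)    # number of predictions per class
    total = len(y_true)

    score = 0.0
    for label, n_true in support.items():
        # True positives: samples of this class that were predicted as this class.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        n_pred = predicted[label]
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        # Each class contributes in proportion to its share of the true labels.
        score += (n_true / total) * f1
    return score


# Hypothetical toy labels for illustration only (not results from the paper).
y_true = ["CLL", "CLL", "DLBCL", "DLBCL", "NLPBL", "healthy", "healthy", "healthy"]
y_pred = ["CLL", "CLL", "DLBCL", "NLPBL", "NLPBL", "healthy", "healthy", "DLBCL"]
print(round(weighted_f1(y_true, y_pred), 4))
```

Because each class is weighted by its support, larger groups such as the control samples influence the score more than smaller ones. This differs from a macro-averaged F1, which weights all classes equally.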

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/46db/11249212/87532da513da/pcbi.1011570.g001.jpg
