Thrifty wide-context models of B cell receptor somatic hypermutation.

Author information

Sung Kevin, Johnson Mackenzie M, Dumm Will, Simon Noah, Haddox Hugh, Fukuyama Julia, Matsen Frederick A

Affiliations

Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, United States.

Department of Biostatistics, University of Washington, Seattle, United States.

Publication information

Elife. 2025 Aug 29;14:RP105471. doi: 10.7554/eLife.105471.

Abstract

Somatic hypermutation (SHM) is the diversity-generating process in antibody affinity maturation. Probabilistic models of SHM are needed for analyzing rare mutations, understanding the selective forces guiding affinity maturation, and understanding the underlying biochemical process. High-throughput data offers the potential to develop and fit models of SHM on relevant data sets. In this article, we model SHM using modern frameworks. We are motivated by recent work suggesting the importance of a wider context for SHM; however, assigning an independent rate to each k-mer leads to an exponential proliferation of parameters. Thus, using convolutions on 3-mer embeddings, we develop 'thrifty' models of SHM of various sizes; these can have fewer free parameters than a 5-mer model and yet have a significantly wider context. These offer a slight performance improvement over a 5-mer model, and other modern model elaborations worsen performance. We also find that a per-site effect is not necessary to explain SHM patterns given nucleotide context. Also, the two current methods for fitting an SHM model (on out-of-frame sequence data and on synonymous mutations) produce significantly different results, and augmenting out-of-frame data with synonymous mutations does not aid out-of-sample performance.
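
The parameter-count argument in the abstract can be illustrated with a rough back-of-the-envelope sketch. The layer layout and the hyperparameter values below (`embedding_dim`, `num_filters`, `kernel_size`) are assumptions chosen for illustration only; they are not taken from the paper and do not describe its actual architecture.

```python
def kmer_param_count(k: int, alphabet_size: int = 4) -> int:
    """Number of independent rates when every k-mer gets its own parameter."""
    return alphabet_size ** k


def thrifty_param_count(embedding_dim: int, num_filters: int, kernel_size: int) -> int:
    """Rough parameter count for a hypothetical 3-mer-embedding + 1D-convolution model.

    Assumed layout (illustrative, not the paper's exact architecture):
      - an embedding table with one row per 3-mer (4**3 = 64 rows)
      - one 1D convolution over the embedded sequence (weights + biases)
      - a linear readout from the filter outputs to one per-site rate (weights + bias)
    """
    embedding = 4 ** 3 * embedding_dim
    convolution = num_filters * embedding_dim * kernel_size + num_filters
    readout = num_filters + 1
    return embedding + convolution + readout


def context_width(kernel_size: int, kmer_length: int = 3) -> int:
    """Bases seen by one output site when the kernel spans overlapping k-mers."""
    return kernel_size + kmer_length - 1


if __name__ == "__main__":
    # Independent-rate models grow as 4**k.
    for k in (3, 5, 7, 11):
        print(f"{k}-mer model: {kmer_param_count(k)} parameters")

    # Hypothetical small convolutional model (all three values are assumptions).
    n_params = thrifty_param_count(embedding_dim=4, num_filters=8, kernel_size=9)
    width = context_width(kernel_size=9)
    print(f"sketch model: {n_params} parameters, context of {width} bases")
```

With these assumed settings the sketch model has 561 parameters and sees an 11-base context, compared with 1024 parameters for an independent-rate 5-mer model and 4,194,304 for an independent-rate 11-mer model.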

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e63d/12396816/055fafb1499e/elife-105471-fig1.jpg
