University of Washington, Seattle, United States.
Fred Hutchinson Cancer Research Center, Seattle, United States.
eLife. 2019 Sep 5;8:e46935. doi: 10.7554/eLife.46935.
Probabilistic models of adaptive immune repertoire sequence distributions can be used to infer the expansion of immune cells in response to stimulus, differentiate genetic from environmental factors that determine repertoire sharing, and evaluate the suitability of various target immune sequences for stimulation via vaccination. Classically, these models are defined in terms of a probabilistic V(D)J recombination model, which is sometimes combined with a selection model. In this paper, we take a different approach, fitting variational autoencoder (VAE) models parameterized by deep neural networks to T cell receptor (TCR) repertoires. We show that simple VAE models can perform accurate cohort frequency estimation, learn the rules of V(D)J recombination, and generalize well to unseen sequences. Further, we demonstrate that VAE-like models can distinguish between real sequences and sequences generated according to a recombination-selection model, and that many characteristics of VAE-generated sequences are similar to those of real sequences.