Department of Computer Science, Dartmouth, Hanover, NH, USA.
BMC Bioinformatics. 2019 May 15;20(1):241. doi: 10.1186/s12859-019-2864-8.
Repertoire sequencing is enabling deep explorations into the cellular immune response, including the characterization of commonalities and differences among T cell receptor (TCR) repertoires from different individuals, pathologies, and antigen specificities. In seeking to understand the generality of patterns observed in different groups of TCRs, it is necessary to balance how well each pattern represents the diversity among TCRs from one group (sensitivity) vs. how many TCRs from other groups it also represents (specificity). The variable complementarity determining regions (CDRs), particularly the third CDRs (CDR3s), interact with major histocompatibility complex (MHC)-presented epitopes from putative antigens, and thus encode the determinants of recognition.
We here systematically characterize the predictive power that can be obtained from CDR3 sequences, using representative, readily interpretable methods for evaluating CDR sequence similarity and then clustering and classifying sequences based on similarity. An initial analysis of CDR3s of known structure, clustered by structural similarity, helps calibrate the limits of sequence diversity among CDRs that might have a common mode of interaction with presented epitopes. Subsequent analyses demonstrate that this same range of sequence similarity strikes a favorable specificity/sensitivity balance in distinguishing twins from non-twins based on overall CDR3 repertoires, classifying CDR3 repertoires by antigen specificity, and distinguishing general pathologies.
We conclude that within a fairly broad range of sequence similarity, matching CDR3 sequences are likely to share specificities.
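The general idea of scoring CDR3 sequence similarity and then grouping sequences that fall within a similarity range can be illustrated with a minimal sketch. It is not the method used in the paper: the similarity measure (normalized edit distance), the single-linkage grouping, the threshold value, and the example sequences are all assumptions chosen for illustration.

```python
from itertools import combinations


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two amino acid strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed similarity score: 1 minus edit distance over the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def cluster(seqs: list[str], threshold: float) -> list[list[str]]:
    """Single-linkage grouping: link any pair with similarity >= threshold."""
    parent = list(range(len(seqs)))

    def find(i: int) -> int:
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if similarity(seqs[i], seqs[j]) >= threshold:
            parent[find(i)] = find(j)

    groups: dict[int, list[str]] = {}
    for i, s in enumerate(seqs):
        groups.setdefault(find(i), []).append(s)
    return list(groups.values())


# Hypothetical CDR3-like sequences, for illustration only.
example = ["CASSLGQAYEQYF", "CASSLGQGYEQYF", "CASSPDRGNTIYF"]
clusters = cluster(example, threshold=0.8)
```

Raising the threshold makes each cluster narrower, so it matches fewer sequences from other groups but also covers less of the diversity within its own group. Lowering it does the reverse, which is the sensitivity/specificity trade-off the abstract describes.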