Identifying T Cell Receptors from High-Throughput Sequencing: Dealing with Promiscuity in TCRα and TCRβ Pairing.
Author information
Lee Edward S, Thomas Paul G, Mold Jeff E, Yates Andrew J
Affiliations
Institute of Infection, Immunity & Inflammation, Glasgow Biomedical Research Centre, University of Glasgow, Glasgow, United Kingdom.
St. Jude Children's Research Hospital, Memphis, Tennessee, United States of America.
Publication information
PLoS Comput Biol. 2017 Jan 19;13(1):e1005313. doi: 10.1371/journal.pcbi.1005313. eCollection 2017 Jan.
Characterisation of the T cell receptors (TCR) involved in immune responses is important for the design of vaccines and immunotherapies for cancer and autoimmune disease. The specificity of the interaction between the TCR heterodimer and its peptide-MHC ligand derives largely from the juxtaposed hypervariable CDR3 regions on the TCRα and TCRβ chains, and obtaining the paired sequences of these regions is a standard for functionally defining the TCR. A brute force approach to identifying the TCRs in a population of T cells is to use high-throughput single-cell sequencing, but currently this process remains costly and risks missing small clones. Alternatively, CDR3α and CDR3β sequences can be associated using their frequency of co-occurrence in independent samples, but this approach can be confounded by the sharing of CDR3α and CDR3β across clones, commonly observed within epitope-specific T cell populations. The accurate, exhaustive, and economical recovery of TCR sequences from such populations therefore remains a challenging problem. Here we describe an algorithm for performing frequency-based pairing (alphabetr) that accommodates CDR3α- and CDR3β-sharing, cells expressing two TCRα chains, and multiple forms of sequencing error. The algorithm also yields accurate estimates of clonal frequencies.
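The idea of associating CDR3α and CDR3β sequences by their co-occurrence across independent samples can be illustrated with a small sketch. This is not the alphabetr algorithm: it is a naive Jaccard-style score, and the function names, the well layout, and the sequence labels (A1, B1, and so on) are all hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import product


def cooccurrence_scores(wells):
    """Score every (alpha, beta) pair by how consistently the two co-occur.

    wells: list of (alpha_set, beta_set) tuples, one per independent sample.
    Returns a dict mapping (alpha, beta) to a Jaccard-style score:
    wells containing both / wells containing either.
    """
    alpha_counts = Counter()
    beta_counts = Counter()
    pair_counts = Counter()
    for alphas, betas in wells:
        alpha_counts.update(alphas)                 # wells containing each alpha
        beta_counts.update(betas)                   # wells containing each beta
        pair_counts.update(product(alphas, betas))  # wells containing both
    scores = {}
    for (a, b), both in pair_counts.items():
        either = alpha_counts[a] + beta_counts[b] - both
        scores[(a, b)] = both / either
    return scores


def best_partner_per_beta(scores):
    """Naive assignment: keep the single highest-scoring alpha for each beta."""
    best = {}
    for (a, b), s in scores.items():
        if b not in best or s > best[b][1]:
            best[b] = (a, s)
    return best


# Hypothetical data: three clones (A1/B1, A2/B2, A3/B3) spread over four wells.
wells = [
    ({"A1", "A2"}, {"B1", "B2"}),
    ({"A1"}, {"B1"}),
    ({"A2", "A3"}, {"B2", "B3"}),
    ({"A1", "A3"}, {"B1", "B3"}),
]
scores = cooccurrence_scores(wells)
best = best_partner_per_beta(scores)
```

In this toy data each true pair always appears together, so it scores 1.0, while an incidental pair such as (A1, B2) scores 0.25. The one-partner-per-chain rule in `best_partner_per_beta` is exactly what breaks down when clones share a CDR3α or CDR3β, or when a cell expresses two TCRα chains, which are the cases the abstract says a frequency-based method has to accommodate.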