High-throughput sequencing of the T-cell receptor repertoire: pitfalls and opportunities.

Affiliations

Massachusetts General Hospital, Boston, MA.

University College of London, Bloomsbury, UK.

Publication information

Brief Bioinform. 2018 Jul 20;19(4):554-565. doi: 10.1093/bib/bbw138.

Abstract

T-cell specificity is determined by the T-cell receptor, a heterodimeric protein coded for by an extremely diverse set of genes produced by imprecise somatic gene recombination. Massively parallel high-throughput sequencing allows millions of different T-cell receptor genes to be characterized from a single sample of blood or tissue. However, the extraordinary heterogeneity of the immune repertoire poses significant challenges for subsequent analysis of the data. We outline the major steps in processing of repertoire data, considering low-level processing of raw sequence files and high-level algorithms, which seek to extract biological or pathological information. The latest generation of bioinformatics tools allows millions of DNA sequences to be accurately and rapidly assigned to their respective variable V and J gene segments, and to reconstruct an almost error-free representation of the non-templated additions and deletions that occur. High-level processing can measure the diversity of the repertoire in different samples, quantify V and J usage and identify private and public T-cell receptors. Finally, we discuss the major challenge of linking T-cell receptor sequence to function, and specifically to antigen recognition. Sophisticated machine learning algorithms are being developed that can combine the paradoxical degeneracy and cross-reactivity of individual T-cell receptors with the specificity of the overall T-cell immune response. Computational analysis will provide the key to unlock the potential of the T-cell receptor repertoire to give insight into the fundamental biology of the adaptive immune system and to provide powerful biomarkers of disease.
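The high-level steps named in the abstract (measuring diversity, quantifying V usage, and separating public from private receptors) can be illustrated with a minimal sketch. This is not the authors' method or any specific tool's API. It assumes clonotypes are already annotated as (V gene, CDR3 amino-acid sequence, J gene) tuples with read counts, uses Shannon entropy as one possible diversity measure, and treats a clonotype as "public" if it appears in more than one sample. The `samples` dictionary, its counts, the CDR3 strings, and all function names are hypothetical toy values chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical toy data (not from the paper): each sample maps an annotated
# clonotype (V gene, CDR3 amino-acid sequence, J gene) to its read count.
samples = {
    "donor_A": {
        ("TRBV20-1", "CSARDLGQGNTEAFF", "TRBJ1-1"): 6,
        ("TRBV7-9", "CASSLGQAYEQYF", "TRBJ2-7"): 3,
        ("TRBV5-1", "CASSLEGDTQYF", "TRBJ2-3"): 1,
    },
    "donor_B": {
        ("TRBV7-9", "CASSLGQAYEQYF", "TRBJ2-7"): 5,
        ("TRBV28", "CASSFSTDTQYF", "TRBJ2-3"): 5,
    },
}


def shannon_entropy(clonotype_counts):
    """Shannon entropy (natural log) of clonotype frequencies in one sample."""
    total = sum(clonotype_counts.values())
    return -sum(
        (n / total) * math.log(n / total)
        for n in clonotype_counts.values()
        if n > 0
    )


def v_gene_usage(clonotype_counts):
    """Fraction of reads assigned to each V gene in one sample."""
    total = sum(clonotype_counts.values())
    usage = Counter()
    for (v_gene, _cdr3, _j_gene), n in clonotype_counts.items():
        usage[v_gene] += n
    return {v_gene: n / total for v_gene, n in usage.items()}


def public_and_private(all_samples):
    """Split clonotypes into those seen in >1 sample (public) and exactly 1 (private)."""
    seen_in = Counter()
    for counts in all_samples.values():
        for clonotype in counts:  # dict keys are unique within a sample
            seen_in[clonotype] += 1
    public = {c for c, k in seen_in.items() if k > 1}
    private = {c for c, k in seen_in.items() if k == 1}
    return public, private


if __name__ == "__main__":
    for name, counts in samples.items():
        print(name, "entropy:", round(shannon_entropy(counts), 3))
        print(name, "V usage:", v_gene_usage(counts))
    public, private = public_and_private(samples)
    print("public:", public)
    print("private count:", len(private))
```

In practice, read counts would first need the error correction and V/J assignment described in the abstract, and the definition of "public" (exact match versus similar sequences, nucleotide versus amino acid) is a design choice that this sketch simplifies to exact tuple identity.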


