A novel statistical method for decontaminating T-cell receptor sequencing data.

Affiliations

Department of Biostatistics and Data Science, The University of Texas Health Science Center at Houston, Houston, Texas 77030, USA.

Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA.

Publication information

Brief Bioinform. 2023 Jul 20;24(4). doi: 10.1093/bib/bbad230.

DOI: 10.1093/bib/bbad230

PMID: 37337757

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10359082/

Abstract

The T-cell receptor (TCR) repertoire is highly diverse among the population and plays an essential role in initiating multiple immune processes. TCR sequencing (TCR-seq) has been developed to profile the T cell repertoire. Similar to other high-throughput experiments, contamination can happen during several steps of TCR-seq, including sample collection, preparation and sequencing. Such contamination creates artifacts in the data, leading to inaccurate or even biased results. Most existing methods assume 'clean' TCR-seq data as the starting point with no ability to handle data contamination. Here, we develop a novel statistical model to systematically detect and remove contamination in TCR-seq data. We summarize the observed contamination into two sources, pairwise and cross-cohort. For both sources, we provide visualizations and summary statistics to help users assess the severity of the contamination. Incorporating prior information from 14 existing TCR-seq datasets with minimum contamination, we develop a straightforward Bayesian model to statistically identify contaminated samples. We further provide strategies for removing the impacted sequences to allow for downstream analysis, thus avoiding any need to repeat experiments. Our proposed model shows robustness in contamination detection compared with a few off-the-shelf detection methods in simulation studies. We illustrate the use of our proposed method on two TCR-seq datasets generated locally.
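
As a rough illustration of what a pairwise summary statistic could look like, the sketch below computes the Jaccard index of shared sequences for every pair of samples. It is an assumption made for illustration only and is not the authors' Bayesian model; the function names (`jaccard`, `pairwise_sharing`) and the toy CDR3 sequences are hypothetical and do not come from the paper.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Fraction of distinct sequences shared between two samples."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_sharing(samples: dict) -> dict:
    """Return the Jaccard index for every unordered pair of samples."""
    return {
        (s1, s2): jaccard(samples[s1], samples[s2])
        for s1, s2 in combinations(sorted(samples), 2)
    }


# Toy data: hypothetical CDR3 amino-acid sequences, not taken from the paper.
samples = {
    "A": {"CASSLGQGAEAFF", "CASSPDRGNTEAFF", "CASSQETQYF"},
    "B": {"CASSLGQGAEAFF", "CASSPDRGNTEAFF", "CASRGDSYEQYF"},
    "C": {"CASSYSTDTQYF", "CASSLAGGYEQYF"},
}

sharing = pairwise_sharing(samples)
for pair, score in sharing.items():
    print(pair, round(score, 2))
```

Because TCR repertoires are highly diverse across individuals, an unusually high overlap between two unrelated samples (here, A and B) may be worth inspecting as possible pairwise contamination. The threshold for "unusually high" is left open here, since the abstract does not specify one.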

Similar articles

Rigorous benchmarking of T-cell receptor repertoire profiling methods for cancer RNA sequencing.

Brief Bioinform. 2023 Jul 20;24(4). doi: 10.1093/bib/bbad220.

Enriching and Characterizing T Cell Repertoires from 3' Barcoded Single-Cell Whole Transcriptome Amplification Products.

Methods Mol Biol. 2022;2574:159-182. doi: 10.1007/978-1-0716-2712-9_7.

3D: diversity, dynamics, differential testing - a proposed pipeline for analysis of next-generation sequencing T cell repertoire data.

BMC Bioinformatics. 2017 Feb 27;18(1):129. doi: 10.1186/s12859-017-1544-9.

An ultra-sensitive T-cell receptor detection method for TCR-Seq and RNA-Seq data.

Bioinformatics. 2020 Aug 1;36(15):4255-4262. doi: 10.1093/bioinformatics/btaa432.

Clinical T Cell Receptor Repertoire Deep Sequencing and Analysis: An Application to Monitor Immune Reconstitution Following Cord Blood Transplantation.

Front Immunol. 2018 Nov 5;9:2547. doi: 10.3389/fimmu.2018.02547. eCollection 2018.

RTCR: a pipeline for complete and accurate recovery of T cell repertoires from high throughput sequencing data.

Bioinformatics. 2016 Oct 15;32(20):3098-3106. doi: 10.1093/bioinformatics/btw339. Epub 2016 Jun 20.

[T cell receptor (TCR) repertoire and common analytical tools for high-throughput sequencing].

Xi Bao Yu Fen Zi Mian Yi Xue Za Zhi. 2021 Sep;37(9):851-857.

A Framework for Annotation of Antigen Specificities in High-Throughput T-Cell Repertoire Sequencing Studies.

Front Immunol. 2019 Sep 26;10:2159. doi: 10.3389/fimmu.2019.02159. eCollection 2019.

T-Cell Receptor Repertoire Sequencing in the Era of Cancer Immunotherapy.

Clin Cancer Res. 2023 Mar 14;29(6):994-1008. doi: 10.1158/1078-0432.CCR-22-2469.
