T-cell receptor dynamics in digestive system cancers: a multi-layer machine learning approach for tumor diagnosis and staging.

Author information

Yuan Changjin, Wang Bin, Wang Hong, Wang Fang, Li Xiangze, Zhen Ya'nan

Affiliations

Clinical Laboratory, Affiliated Hospital of Shandong University of Traditional Chinese Medicine, Jinan, China.

Minimally Invasive Surgery, The Third Affiliated Hospital of Shandong First Medical University, Jinan, China.

Publication information

Front Immunol. 2025 Apr 8;16:1556165. doi: 10.3389/fimmu.2025.1556165. eCollection 2025.

Abstract

BACKGROUND

T-cell receptor (TCR) repertoires provide insights into tumor immunology, yet their variations across digestive system cancers are not well understood. Characterizing TCR differences between colorectal cancer (CRC) and gastric cancer (GC), and developing machine learning models that distinguish cancer types, metastatic status, and disease stages, are crucial for guiding clinical practice.

METHODS

A cohort study of 143 tumor patients (96 CRC, 47 GC) was conducted. High-throughput TCR sequencing was performed to capture TCR beta (TRB), delta (TRD), and gamma (TRG) chain data. Tissue-specific patterns in TCR repertoire features, such as V-J gene recombination, complementarity-determining region 3 (CDR3) sequences, and motif distributions, were analyzed. Multi-layer machine learning-based diagnostic models were developed by combining motif-based features with deep learning-based feature extraction using ProteinBERT, applied to the 100 most abundant CDR3 sequences per sample. These models were used to differentiate CRC from GC, distinguish between primary and metastatic CRC lesions, and predict disease stages in CRC.
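The motif-based feature step can be illustrated with a minimal sketch. The abstract does not state the motif length, the counting scheme, or whether the top-100 filter applies to the motif features as well as to the ProteinBERT input, so everything below is an assumption: motifs are taken to be overlapping 3-mers, features are relative frequencies, and the function names `top_cdr3` and `motif_features` and the sample clone counts are hypothetical. The ProteinBERT embedding step is not shown.

```python
from collections import Counter


def top_cdr3(clone_counts, n=100):
    """Return the n most abundant CDR3 sequences.

    clone_counts maps a CDR3 amino acid sequence to its read count.
    Ties are broken alphabetically so the result is deterministic.
    """
    ranked = sorted(clone_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [seq for seq, _ in ranked[:n]]


def motif_features(sequences, k=3):
    """Relative frequency of every overlapping k-mer in the given sequences.

    Assumption: a "motif" is a contiguous k-mer; the paper may define it
    differently.
    """
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {motif: c / total for motif, c in counts.items()}


# Hypothetical clone counts for one sample (illustrative data only).
sample = {"CASSLGQGYEQYF": 120, "CASSPGTGELFF": 45, "CASSLGQ": 3}
features = motif_features(top_cdr3(sample, n=2), k=3)
```

A per-sample dictionary like `features` could then be aligned across samples into a fixed-length vector and passed to a classifier alongside the learned embeddings.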

RESULTS

Tissue-specific differences in TCR repertoires were observed across CRC, GC, and between primary and metastatic lesions, as well as across disease stages in CRC. Distinct V-J gene recombination patterns were identified, with CRC showing enrichment in - combinations, while GC exhibited higher levels of γδT-cell-related recombination. Primary and metastatic lesions of CRC patients displayed distinct V-J recombination preferences (e.g., / higher in metastatic; / higher in primary) and CDR3 sequence differences, with metastatic lesions having shorter TRG CDR3 lengths (p-value = 0.019). Across CRC stages, later stages (III-IV) showed higher clonal diversity (p-value < 0.05) and stage-specific V-J patterns, alongside distinct CDR3 amino acid preferences at N-terminal (positions 1-2) and central positions (positions 5-12). Multi-dimensional machine learning models demonstrated exceptional diagnostic performance across all classification tasks. For distinguishing CRC from GC, the model achieved an accuracy of 97.9% and an area under the curve (AUC) of 0.996. For differentiating primary from metastatic CRC, the model achieved 100% accuracy with an AUC of 1.000. In predicting CRC disease stages, the model attained an accuracy of 96.9% and an AUC of 0.993. Extensive validation using simulated and publicly available datasets confirmed the robustness and reliability of the models, demonstrating consistent performance across diverse datasets and experimental conditions.
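For readers less familiar with the two reported metrics, the following sketch shows how accuracy and AUC are computed for a binary task such as CRC versus GC. It is not the authors' evaluation code. The labels, scores, and 0.5 decision threshold are made-up illustrative values, and the label coding (1 = CRC, 0 = GC) is an assumption. AUC is computed here through its rank interpretation: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison.

    Each positive/negative pair counts 1 if the positive scores higher,
    0.5 on a tie, and 0 otherwise.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Illustrative data only: 1 = CRC, 0 = GC (assumed coding).
y_true = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Accuracy depends on the chosen threshold, while AUC summarizes ranking quality across all thresholds, which is why the two figures are reported side by side.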

CONCLUSIONS

Our investigation provides novel insights into TCR repertoire variations in digestive system tumors and highlights the potential of immune repertoire features as powerful diagnostic tools for understanding cancer progression and potentially improving clinical decision-making.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6b0/12011560/35cbed0fb655/fimmu-16-1556165-g001.jpg
