
Novel Ensemble Feature Selection Approach and Application in Repertoire Sequencing Data.

Authors

He Tao, Baik Jason Min, Kato Chiemi, Yang Hai, Fan Zenghua, Cham Jason, Zhang Li

Affiliations

Department of Mathematics, San Francisco State University, San Francisco, CA, United States.

Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, United States.

Publication information

Front Genet. 2022 Apr 26;13:821832. doi: 10.3389/fgene.2022.821832. eCollection 2022.

DOI:10.3389/fgene.2022.821832
PMID:35559031
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9086194/
Abstract

The T and B cell repertoires make up the adaptive immune system and are mainly generated through somatic V(D)J gene recombination. Thus, VJ gene usage may be a potential prognostic or predictive biomarker. However, analysis of the adaptive immune system is challenging due to the heterogeneity of the clonotypes that make up the repertoire. To address the heterogeneity of the T and B cell repertoire, we proposed a novel ensemble feature selection approach and a customized statistical learning algorithm focusing on VJ gene usage. We applied the proposed approach to T cell receptor sequences from recovered COVID-19 patients and healthy donors, as well as a group of lung cancer patients who received immunotherapy. Our approach identified distinct VJ genes used in the recovered COVID-19 patients compared to the healthy donors, and the VJ genes associated with clinical response in the lung cancer patients. Simulation studies show that the ensemble feature selection approach outperformed other state-of-the-art feature selection methods in both efficiency and accuracy. It consistently yielded higher stability and sensitivity with lower false discovery rates. When integrated with different classification methods, the ensemble feature selection approach had the best prediction accuracy. In conclusion, the proposed approach and the integration procedure are an effective feature selection technique that aids in correctly classifying different subtypes, helping to better understand the signatures in the adaptive immune response associated with disease or treatment and thereby improve treatment strategies.
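The abstract compares feature selection methods on sensitivity, false discovery rate, and stability without defining them. The sketch below shows one common way to compute these quantities for sets of selected features. It is an illustration only and is not taken from the paper: the function names, the use of mean pairwise Jaccard similarity as the stability measure, and the V-J gene pair labels are all assumptions made for this example.

```python
from itertools import combinations


def sensitivity(selected, truth):
    """Fraction of truly informative features that were selected."""
    selected, truth = set(selected), set(truth)
    return len(selected & truth) / len(truth) if truth else 0.0


def false_discovery_rate(selected, truth):
    """Fraction of selected features that are not truly informative."""
    selected, truth = set(selected), set(truth)
    return len(selected - truth) / len(selected) if selected else 0.0


def stability(selections):
    """Mean pairwise Jaccard similarity across repeated selection runs.

    Assumption: Jaccard similarity is used here for illustration; the
    paper may define stability differently.
    """
    sets = [set(s) for s in selections]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 1.0  # a single run is trivially consistent with itself

    def jaccard(a, b):
        union = a | b
        return len(a & b) / len(union) if union else 1.0

    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical V-J gene pair labels, as in a simulation with known truth.
truth = {"TRBV5-1_TRBJ2-7", "TRBV20-1_TRBJ1-1", "TRBV7-9_TRBJ2-1"}
runs = [
    {"TRBV5-1_TRBJ2-7", "TRBV20-1_TRBJ1-1", "TRBV12-3_TRBJ1-2"},
    {"TRBV5-1_TRBJ2-7", "TRBV20-1_TRBJ1-1", "TRBV7-9_TRBJ2-1"},
]

for i, run in enumerate(runs):
    print(i, sensitivity(run, truth), false_discovery_rate(run, truth))
print("stability:", stability(runs))
```

These metrics need a known set of truly informative features, which is why sensitivity and false discovery rate are typically reported from simulation studies rather than from patient data.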

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19b7/9086194/6c9a2de5d9a0/fgene-13-821832-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19b7/9086194/8f0f2d1d1afe/fgene-13-821832-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19b7/9086194/d8099ac8f41b/fgene-13-821832-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19b7/9086194/b4420a5430c4/fgene-13-821832-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19b7/9086194/ce10ff16f436/fgene-13-821832-g005.jpg
