Novel Ensemble Feature Selection Approach and Application in Repertoire Sequencing Data.

Authors

He Tao, Baik Jason Min, Kato Chiemi, Yang Hai, Fan Zenghua, Cham Jason, Zhang Li

Affiliations

Department of Mathematics, San Francisco State University, San Francisco, CA, United States.

Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, United States.

Publication information

Front Genet. 2022 Apr 26;13:821832. doi: 10.3389/fgene.2022.821832. eCollection 2022.

Abstract

The T and B cell repertoires make up the adaptive immune system and are mainly generated through somatic V(D)J gene recombination. Thus, VJ gene usage may be a potential prognostic or predictive biomarker. However, analysis of the adaptive immune system is challenging due to the heterogeneity of the clonotypes that make up the repertoire. To address the heterogeneity of the T and B cell repertoires, we proposed a novel ensemble feature selection approach and a customized statistical learning algorithm focusing on VJ gene usage. We applied the proposed approach to T cell receptor sequences from recovered COVID-19 patients and healthy donors, as well as a group of lung cancer patients who received immunotherapy. Our approach identified distinct VJ genes used in the recovered COVID-19 patients compared with the healthy donors, and VJ genes associated with clinical response in the lung cancer patients. Simulation studies show that the ensemble feature selection approach outperformed other state-of-the-art feature selection methods in both efficiency and accuracy. It consistently yielded higher stability and sensitivity with lower false discovery rates. When integrated with different classification methods, the ensemble feature selection approach had the best prediction accuracy. In conclusion, the proposed approach and the integration procedure provide an effective feature selection technique that aids in correctly classifying different subtypes, helping to better understand the signatures in the adaptive immune response associated with disease or treatment and thereby improve treatment strategies.
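The abstract reports sensitivity, false discovery rate, and stability for the simulation studies without defining them. As a rough illustration only, the sketch below computes these quantities for a selected feature set using their standard definitions. The function names and the `VJ_*` feature labels are hypothetical, and measuring stability as mean pairwise Jaccard similarity is an assumption; the abstract does not say which stability measure the authors used.

```python
from itertools import combinations


def sensitivity_and_fdr(selected, truly_informative):
    """Return (sensitivity, false discovery rate) of a selected feature set.

    sensitivity = true positives / number of truly informative features
    FDR         = false positives / number of selected features
    """
    selected, truth = set(selected), set(truly_informative)
    true_pos = len(selected & truth)
    sensitivity = true_pos / len(truth) if truth else 0.0
    fdr = (len(selected) - true_pos) / len(selected) if selected else 0.0
    return sensitivity, fdr


def mean_jaccard_stability(selections):
    """Mean pairwise Jaccard similarity across repeated selections.

    Assumption: Jaccard similarity is one common stability measure;
    it is not confirmed to be the one used in the paper.
    """
    sets = [set(s) for s in selections]
    pairs = list(combinations(sets, 2))
    if not pairs:
        return 1.0  # a single run is trivially consistent with itself
    scores = []
    for a, b in pairs:
        union = a | b
        scores.append(len(a & b) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


# Hypothetical simulated example: four features carry true signal.
truth = ["VJ_1", "VJ_2", "VJ_3", "VJ_4"]
picked = ["VJ_1", "VJ_2", "VJ_9"]
sens, fdr = sensitivity_and_fdr(picked, truth)

# Hypothetical selections from three resampled data sets.
runs = [["VJ_1", "VJ_2"], ["VJ_1", "VJ_3"], ["VJ_1", "VJ_2"]]
stability = mean_jaccard_stability(runs)
```

In this toy example two of the four informative features are recovered and one of the three selected features is spurious, so sensitivity is 0.5 and the false discovery rate is 1/3.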

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/19b7/9086194/6c9a2de5d9a0/fgene-13-821832-g001.jpg
