Supervised learning of high-confidence phenotypic subpopulations from single-cell data.

Authors

Ren Tao, Chen Canping, Danilov Alexey V, Liu Susan, Guan Xiangnan, Du Shunyi, Wu Xiwei, Sherman Mara H, Spellman Paul T, Coussens Lisa M, Adey Andrew C, Mills Gordon B, Wu Ling-Yun, Xia Zheng

Affiliations

Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China.

School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China.

Publication information

bioRxiv. 2023 Mar 25:2023.03.23.533712. doi: 10.1101/2023.03.23.533712.

Abstract

Accurately identifying phenotype-relevant cell subsets from heterogeneous cell populations is crucial for delineating the underlying mechanisms driving biological or clinical phenotypes. Here, by deploying a learning-with-rejection strategy, we developed a novel supervised learning framework called PENCIL to identify subpopulations associated with categorical or continuous phenotypes from single-cell data. By embedding a feature selection function into this flexible framework, for the first time, we were able to select informative features and identify cell subpopulations simultaneously, which enables the accurate identification of phenotypic subpopulations otherwise missed by methods incapable of concurrent gene selection. Furthermore, the regression mode of PENCIL presents a novel ability for supervised phenotypic trajectory learning of subpopulations from single-cell data. We conducted comprehensive simulations to evaluate PENCIL's versatility in simultaneous gene selection, subpopulation identification and phenotypic trajectory prediction. PENCIL is fast and scalable, analyzing 1 million cells within 1 hour. Using the classification mode, PENCIL detected T-cell subpopulations associated with melanoma immunotherapy outcomes. Moreover, when applied to scRNA-seq of a mantle cell lymphoma patient with drug treatment across multiple time points, the regression mode of PENCIL revealed a transcriptional treatment response trajectory. Collectively, our work introduces a scalable and flexible infrastructure to accurately identify phenotype-associated subpopulations from single-cell data.
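The abstract names learning with rejection but does not spell out the mechanism. As a rough illustration of the general idea only, the sketch below applies Chow's rule, a classical reject option: a sample is given a label only when its top class posterior reaches 1 minus the rejection cost, and is otherwise left unassigned. This is not PENCIL's implementation; the function name `chow_reject`, the `reject_cost` parameter, and the posterior values are all hypothetical and chosen purely for illustration.

```python
def chow_reject(posteriors, reject_cost):
    """Apply Chow's rule to rows of class posteriors.

    Returns one entry per row: the index of the most probable class, or
    None when the row is rejected because its top posterior falls below
    1 - reject_cost.
    """
    if not 0.0 <= reject_cost <= 1.0:
        raise ValueError("reject_cost must be in [0, 1]")
    threshold = 1.0 - reject_cost
    decisions = []
    for row in posteriors:
        # Index of the class with the highest posterior probability.
        best = max(range(len(row)), key=lambda k: row[k])
        decisions.append(best if row[best] >= threshold else None)
    return decisions


# Hypothetical posteriors for four cells over two phenotype labels
# (illustrative numbers, not data from the paper).
cell_posteriors = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.10, 0.90],
    [0.48, 0.52],
]
labels = chow_reject(cell_posteriors, reject_cost=0.2)
print(labels)
```

In this toy setting, the cells left as `None` play the role of cells that are not confidently tied to either phenotype, while the labeled cells correspond loosely to the high-confidence subpopulations the abstract describes.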

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5607/10055361/b6b6b0ee08c2/nihpp-2023.03.23.533712v1-f0001.jpg
