scPanel: a tool for automatic identification of sparse gene panels for generalizable patient classification using scRNA-seq datasets.

Affiliations

Programme in Cardiovascular and Metabolic Disorders, Centre for Computational Biology, Duke-NUS Medical School, 8 College Road, Singapore 169857, Singapore.

The School of Mechanical and Aerospace Engineering and the School of Electrical and Electronic Engineering, Nanyang Technological University, 50 Nanyang Ave, Singapore 639798, Singapore.

Publication information

Brief Bioinform. 2024 Sep 23;25(6). doi: 10.1093/bib/bbae482.

DOI:10.1093/bib/bbae482

PMID:39350339

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11442147/

Abstract

Single-cell RNA sequencing (scRNA-seq) technologies can generate transcriptomic profiles at a single-cell resolution in large patient cohorts, facilitating discovery of gene and cellular biomarkers for disease. Yet, when the number of biomarker genes is large, the translation to clinical applications is challenging due to prohibitive sequencing costs. Here, we introduce scPanel, a computational framework designed to bridge the gap between biomarker discovery and clinical application by identifying a sparse gene panel for patient classification from the cell population(s) most responsive to perturbations (e.g. diseases/drugs). scPanel incorporates a data-driven way to automatically determine a minimal number of informative biomarker genes. Patient-level classification is achieved by aggregating the prediction probabilities of cells associated with a patient using the area under the curve score. Application of scPanel to scleroderma, colorectal cancer, and COVID-19 datasets resulted in high patient classification accuracy using only a small number of genes (<20), automatically selected from the entire transcriptome. In the COVID-19 case study, we demonstrated cross-dataset generalizability in predicting disease state in an external patient cohort. scPanel outperforms other state-of-the-art gene selection methods for patient classification and can be used to identify parsimonious sets of reliable biomarker candidates for clinical translation.
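The patient-level aggregation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not spell out how the area under the curve is computed, so the code below assumes one plausible reading: for each patient, sweep a threshold from 0 to 1, record the fraction of that patient's cells whose predicted probability is at or above the threshold, and take the area under that curve as the patient score. The function names, the threshold grid, and the 0.5 decision cutoff are all hypothetical and are not taken from the scPanel package.

```python
import numpy as np


def patient_auc_score(cell_probs, n_thresholds=101):
    """Aggregate per-cell probabilities into one patient-level score.

    Assumption (not confirmed by the abstract): the score is the area under
    the curve of "fraction of cells predicted positive" versus threshold.
    """
    probs = np.asarray(cell_probs, dtype=float)
    thresholds = np.linspace(0.0, 1.0, n_thresholds)
    # Fraction of this patient's cells at or above each threshold.
    frac_positive = (probs[None, :] >= thresholds[:, None]).mean(axis=1)
    # Trapezoidal rule, written out to avoid version-specific NumPy helpers.
    widths = np.diff(thresholds)
    heights = (frac_positive[:-1] + frac_positive[1:]) / 2.0
    return float(np.sum(widths * heights))


def classify_patients(cell_probs_by_patient, cutoff=0.5):
    """Label a patient positive when the aggregated score reaches the cutoff.

    The cutoff of 0.5 is a hypothetical default chosen for illustration.
    """
    return {
        patient_id: patient_auc_score(probs) >= cutoff
        for patient_id, probs in cell_probs_by_patient.items()
    }


# Toy per-cell probabilities, invented purely for illustration.
example = {
    "patient_A": [0.90, 0.80, 0.95],
    "patient_B": [0.10, 0.20, 0.05],
}
labels = classify_patients(example)
```

Under this reading the score is a numerical approximation of the mean per-cell probability, since the area under the fraction-above-threshold curve over [0, 1] equals the mean of the probabilities. The actual tool may weight or select cells differently.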

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1ed4/11442147/3c40a5a704e2/bbae482f1.jpg
