SmartImpute: A Targeted Imputation Framework for Single-cell Transcriptome Data.

Author information

Yao Sijie, Yu Xiaoqing, Wang Xuefeng

Affiliations

Department of Biostatistics and Bioinformatics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, Florida 33612, USA.

Publication information

bioRxiv. 2024 Jul 18:2024.07.15.603649. doi: 10.1101/2024.07.15.603649.

Abstract

Single-cell RNA sequencing (scRNA-seq) has revolutionized our understanding of cellular heterogeneity and tissue transcriptomic complexity. However, the high frequency of dropout events in scRNA-seq data complicates downstream analyses such as cell type identification and trajectory inference. Existing imputation methods address the dropout problem but face limitations such as high computational cost and risk of over-imputation. We present SmartImpute, a novel computational framework designed for targeted imputation of scRNA-seq data. SmartImpute focuses on a predefined set of marker genes, enhancing the biological relevance and computational efficiency of the imputation process while minimizing the risk of model misspecification. Utilizing a modified Generative Adversarial Imputation Network architecture, SmartImpute accurately imputes the missing gene expression and distinguishes between true biological zeros and missing values, preventing overfitting and preserving biologically relevant zeros. To ensure reproducibility, we also provide a function based on the GPT4 model to create target gene panels depending on the tissue types and research context. Our results, based on scRNA-seq data from head and neck squamous cell carcinoma and human bone marrow, demonstrate that SmartImpute significantly enhances cell type annotation and clustering accuracy while reducing computational burden. Benchmarking against other imputation methods highlights SmartImpute's superior performance in terms of both accuracy and efficiency. Overall, SmartImpute provides a lightweight, efficient, and biologically relevant solution for addressing dropout events in scRNA-seq data, facilitating deeper insights into cellular heterogeneity and disease progression. Furthermore, SmartImpute's targeted approach can be extended to spatial omics data, which also contain many missing values.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/56fa/11275709/3aab4ddda5c0/nihpp-2024.07.15.603649v1-f0001.jpg
