iDESC: identifying differential expression in single-cell RNA sequencing data with multiple subjects.

Affiliations

Department of Biostatistics, Yale School of Public Health, New Haven, CT, 06520, USA.

Section of Pulmonary, Critical Care and Sleep Medicine, Yale School of Medicine, New Haven, CT, 06520, USA.

Publication information

BMC Bioinformatics. 2023 Aug 22;24(1):318. doi: 10.1186/s12859-023-05432-8.

Abstract

BACKGROUND

Single-cell RNA sequencing (scRNA-seq) technology has enabled assessment of transcriptome-wide changes at single-cell resolution. Because environmental exposure and genetic background differ across subjects, the subject effect is a major source of variation in scRNA-seq data with multiple subjects and severely confounds cell-type-specific differential expression (DE) analysis. Moreover, dropout events are prevalent in scRNA-seq data and produce an excessive number of zeros, which further complicates DE analysis.

RESULTS

We developed iDESC to detect cell-type-specific DE genes between two groups of subjects in scRNA-seq data. iDESC uses a zero-inflated negative binomial mixed model to account for both the subject effect and dropouts. The dropout rate (the prevalence of dropout events) was shown to depend on gene expression level, and this dependence is modeled by pooling information across genes. The subject effect is modeled as a random effect in the log-mean of the negative binomial component. We evaluated iDESC and compared its performance with that of eleven existing DE analysis methods. Using simulated data, we demonstrated that iDESC had well-controlled type I error and higher power than the existing methods. Applying the methods with well-controlled type I error to three real scRNA-seq datasets from the same tissue and disease showed that iDESC achieved the best consistency between datasets and the best disease relevance.
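
The model structure described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the log link with a library-size offset, and all numeric values are assumptions chosen for illustration, and the dropout probability `pi` is passed in directly rather than estimated from expression level by pooling across genes. The sketch shows only how a zero-inflated negative binomial probability combines a dropout component with a negative binomial whose log-mean contains a group (disease) effect and a per-subject random effect.

```python
import math


def nb_log_pmf(y, mu, theta):
    """Log-probability of count y under a negative binomial with mean mu
    and size theta, so that the variance is mu + mu**2 / theta."""
    return (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )


def zinb_pmf(y, mu, theta, pi):
    """Probability of count y under a zero-inflated negative binomial:
    with probability pi the count is a dropout zero, otherwise it is
    drawn from the negative binomial component."""
    nb = math.exp(nb_log_pmf(y, mu, theta))
    return pi * (y == 0) + (1.0 - pi) * nb


def cell_mean(intercept, group_effect, group, subject_effect, library_size):
    """Negative binomial mean for one cell (hypothetical parameterization):
    log(mu) = log(library_size) + intercept
              + group_effect * group + subject_effect,
    where subject_effect is the random effect of the cell's subject."""
    return library_size * math.exp(
        intercept + group_effect * group + subject_effect
    )


# Illustrative values only (assumptions, not estimates from any dataset).
mu_control = cell_mean(-8.0, 0.7, group=0, subject_effect=0.3, library_size=5000)
mu_disease = cell_mean(-8.0, 0.7, group=1, subject_effect=-0.2, library_size=5000)

# Probability of observing a zero count in each cell, dropouts included.
p_zero_control = zinb_pmf(0, mu_control, theta=2.0, pi=0.25)
p_zero_disease = zinb_pmf(0, mu_disease, theta=2.0, pi=0.25)
```

In this parameterization the subject effect shifts the log-mean of every cell from the same subject by the same amount, which is why ignoring it can make subject-to-subject differences look like a group effect.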

CONCLUSIONS

iDESC achieved more accurate and robust DE analysis results by separating the subject effect from the disease effect while accounting for dropouts. This suggests that considering both the subject effect and dropouts is important in the DE analysis of scRNA-seq data with multiple subjects.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/844b/10463720/dcb61098ac5c/12859_2023_5432_Fig1_HTML.jpg
