Hierarchicell: an R-package for estimating power for tests of differential expression with single-cell data.

Authors

Zimmerman Kip D, Langefeld Carl D

Affiliations

Center for Precision Medicine, Wake Forest School of Medicine, Winston-Salem, NC, USA.

Department of Biostatistics and Data Science, Wake Forest School of Medicine, Winston-Salem, NC, USA.

Publication information

BMC Genomics. 2021 May 1;22(1):319. doi: 10.1186/s12864-021-07635-w.

DOI: 10.1186/s12864-021-07635-w
PMID: 33932993
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8088563/
Abstract

BACKGROUND

Study design is a critical aspect of any experiment, and sample size calculations for statistical power that are consistent with that study design are central to robust and reproducible results. However, the existing power calculators for tests of differential expression in single-cell RNA-seq data focus on the total number of cells and not the number of independent experimental units, the true unit of interest for power. Thus, current methods grossly overestimate the power.
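The gap between the number of cells and the number of independent units can be made concrete with the standard design-effect formula for clustered data. This is a minimal sketch and not part of Hierarchicell: it assumes every individual contributes the same number of cells and that a single intraclass correlation (ICC) describes how similar cells from one individual are. The function names are illustrative only.

```python
def design_effect(cells_per_individual: int, icc: float) -> float:
    """Kish design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    return 1.0 + (cells_per_individual - 1) * icc


def effective_sample_size(n_individuals: int, cells_per_individual: int, icc: float) -> float:
    """Number of independent observations that the correlated cells are worth."""
    total_cells = n_individuals * cells_per_individual
    return total_cells / design_effect(cells_per_individual, icc)


if __name__ == "__main__":
    # 10 individuals with 1000 cells each, under increasing within-individual correlation.
    for icc in (0.0, 0.05, 0.5, 1.0):
        n_eff = effective_sample_size(10, 1000, icc)
        print(f"ICC={icc:.2f}: 10000 cells behave like {n_eff:.1f} independent observations")
```

With no correlation the effective sample size equals the cell count; as the ICC approaches 1 it collapses to the number of individuals, which is why a calculator driven by total cell count can overstate power.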

RESULTS

Hierarchicell is the first single-cell power calculator to explicitly simulate and account for the hierarchical correlation structure (i.e., within-sample correlation) that exists in single-cell RNA-seq data. Hierarchicell, an R-package available on GitHub, estimates the within-sample correlation structure from real data to simulate hierarchical single-cell RNA-seq data and estimate power for tests of differential expression. This multi-stage approach models gene dropout rates, intra-individual dispersion, inter-individual variation, a variable or fixed number of cells per individual, and the correlation among cells within an individual. We demonstrate that, without modeling the within-sample correlation structure and without properly accounting for the correlation in downstream analysis, estimates of power are falsely inflated. Hierarchicell can be used to estimate power for binary and continuous phenotypes based on a user-specified number of independent experimental units (e.g., individuals) and of cells within each experimental unit.
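The inflation described above can be illustrated with a small Monte Carlo experiment. The sketch below is not Hierarchicell's algorithm and does not use its API. It assumes a single gene whose expression is normally distributed on a log-like scale, with no dropout, and compares a cell-level z-test (which treats cells as independent) against an exact permutation test on per-individual means. All function names and parameter values are hypothetical.

```python
import itertools
import math
import random
from statistics import NormalDist, fmean, variance


def simulate_individual(rng, group_shift, n_cells, sd_between, sd_within):
    """One individual's cells: a shared individual-level offset plus cell-level noise."""
    individual_mean = group_shift + rng.gauss(0.0, sd_between)
    return [individual_mean + rng.gauss(0.0, sd_within) for _ in range(n_cells)]


def cell_level_pvalue(cells_a, cells_b):
    """Two-sided z-test that (wrongly) treats every cell as independent."""
    se = math.sqrt(variance(cells_a) / len(cells_a) + variance(cells_b) / len(cells_b))
    z = (fmean(cells_a) - fmean(cells_b)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def individual_level_pvalue(means_a, means_b):
    """Exact two-sided permutation test on per-individual means (small n only)."""
    pooled = means_a + means_b
    n_a, n_b = len(means_a), len(means_b)
    total = sum(pooled)
    observed = abs(fmean(means_a) - fmean(means_b))
    hits = count = 0
    for idx in itertools.combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
        count += 1
    return hits / count


def rejection_rates(effect, n_per_group=5, n_cells=50, sd_between=1.0,
                    sd_within=1.0, n_sims=200, alpha=0.05, seed=1):
    """Fraction of simulated datasets rejected by (cell-level, individual-level) tests."""
    rng = random.Random(seed)
    cell_hits = indiv_hits = 0
    for _ in range(n_sims):
        controls = [simulate_individual(rng, 0.0, n_cells, sd_between, sd_within)
                    for _ in range(n_per_group)]
        cases = [simulate_individual(rng, effect, n_cells, sd_between, sd_within)
                 for _ in range(n_per_group)]
        cells_a = [y for ind in controls for y in ind]
        cells_b = [y for ind in cases for y in ind]
        if cell_level_pvalue(cells_a, cells_b) < alpha:
            cell_hits += 1
        means_a = [fmean(ind) for ind in controls]
        means_b = [fmean(ind) for ind in cases]
        if individual_level_pvalue(means_a, means_b) <= alpha:
            indiv_hits += 1
    return cell_hits / n_sims, indiv_hits / n_sims


if __name__ == "__main__":
    print("null (effect=0):", rejection_rates(effect=0.0))
    print("alternative (effect=2):", rejection_rates(effect=2.0))
```

With `effect=0.0` both rates are type I error rates. Under these assumed variances the cell-level rate should land far above the nominal 5%, while the individual-level rate stays near or below it. The apparent "power" of the cell-level test under the alternative is therefore not trustworthy. The exhaustive permutation loop is only practical for a handful of individuals per group; larger designs would need random permutations or a mixed model.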

CONCLUSIONS

Hierarchicell is a user-friendly R-package that provides accurate estimates of power for testing hypotheses of differential expression in single-cell RNA-seq data. This R-package represents an important addition to single-cell RNA analytic tools and will help researchers design experiments with appropriate and accurate power, increasing discovery and improving robustness and reproducibility.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f566/8088563/d84503ff5936/12864_2021_7635_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f566/8088563/0830682c2b5b/12864_2021_7635_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f566/8088563/8f537df59910/12864_2021_7635_Fig3_HTML.jpg
