Variance component score test for time-course gene set analysis of longitudinal RNA-seq data.

Authors

Agniel Denis, Hejblum Boris P

Affiliations

Department of Biomedical Informatics, Harvard Medical School, 10 Shattuck St, Boston, MA 02115, USA.

Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA; University of Bordeaux, ISPED, INSERM U1219, INRIA SISTM, 146 rue Léo Saignat, 33076 Bordeaux, France; Vaccine Research Institute, Créteil, France.

Publication information

Biostatistics. 2017 Oct 1;18(4):589-604. doi: 10.1093/biostatistics/kxx005.

DOI: 10.1093/biostatistics/kxx005
PMID: 28334305
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5862256/
Abstract

As gene expression measurement technology is shifting from microarrays to sequencing, the statistical tools available for their analysis must be adapted since RNA-seq data are measured as counts. It has been proposed to model RNA-seq counts as continuous variables using nonparametric regression to account for their inherent heteroscedasticity. In this vein, we propose tcgsaseq, a principled, model-free, and efficient method for detecting longitudinal changes in RNA-seq gene sets defined a priori. The method identifies those gene sets whose expression varies over time, based on an original variance component score test accounting for both covariates and heteroscedasticity without assuming any specific parametric distribution for the (transformed) counts. We demonstrate that despite the presence of a nonparametric component, our test statistic has a simple form and limiting distribution, and both may be computed quickly. A permutation version of the test is additionally proposed for very small sample sizes. Applied to both simulated data and two real datasets, tcgsaseq is shown to exhibit very good statistical properties, with an increase in stability and power when compared to state-of-the-art methods ROAST (rotation gene set testing), edgeR, and DESeq2, which can fail to control the type I error under certain realistic settings. We have made the method available for the community in the R package tcgsaseq.
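
The permutation version of the test mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the tcgsaseq implementation. It assumes independent samples (no within-subject correlation), a linear time effect only, and precision weights supplied by the caller rather than estimated from the data. The function names `score_statistic` and `permutation_pvalue` and the toy data are hypothetical.

```python
import random


def score_statistic(expr, time, weights=None):
    """Simplified gene-set score statistic for a linear time effect.

    expr: list of genes, each a list of n (transformed) expression values.
    time: list of n time points, one per sample.
    weights: optional per-gene, per-sample precision weights (same shape
        as expr); assumed here to be estimated elsewhere.
    """
    n = len(time)
    t_mean = sum(time) / n
    t_centered = [t - t_mean for t in time]
    q = 0.0
    for g, y in enumerate(expr):
        w = weights[g] if weights is not None else [1.0] * n
        # Weighted mean is the fitted value under the null (no time effect).
        y_mean = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
        # Per-gene score: weighted covariance-like term between time and residuals.
        score = sum(wi * ti * (yi - y_mean) for wi, ti, yi in zip(w, t_centered, y))
        q += score ** 2
    return q / n


def permutation_pvalue(expr, time, weights=None, n_perm=999, seed=0):
    """Permutation p-value obtained by shuffling time labels across samples."""
    rng = random.Random(seed)
    q_obs = score_statistic(expr, time, weights)
    exceed = 0
    perm = list(time)
    for _ in range(n_perm):
        rng.shuffle(perm)
        if score_statistic(expr, perm, weights) >= q_obs:
            exceed += 1
    # Adding 1 to numerator and denominator keeps the p-value strictly positive.
    return q_obs, (1 + exceed) / (1 + n_perm)


# Toy gene set: two genes measured on eight samples at four time points.
time = [0, 0, 1, 1, 2, 2, 3, 3]
gene_set = [
    [0.0, 0.2, 1.1, 0.9, 2.0, 2.1, 3.1, 2.9],
    [1.0, 0.9, 1.6, 1.4, 2.1, 1.9, 2.4, 2.6],
]
q_obs, p_value = permutation_pvalue(gene_set, time)
print(f"Q = {q_obs:.3f}, permutation p = {p_value:.4f}")
```

With `n_perm` permutations the smallest attainable p-value is 1 / (n_perm + 1), which is one reason a permutation test of this kind suits very small sample sizes but needs many permutations when small p-values must be resolved.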
