Large scale maximum average power multiple inference on time-course count data with application to RNA-seq analysis.

Affiliations

Department of Statistics, Colorado State University, Fort Collins, Colorado.

Department of Biology, Colorado State University, Fort Collins, Colorado.

Publication information

Biometrics. 2020 Mar;76(1):9-22. doi: 10.1111/biom.13144. Epub 2019 Nov 14.

DOI:10.1111/biom.13144
PMID:31483480
Abstract

Experiments that longitudinally collect RNA sequencing (RNA-seq) data can provide transformative insights in biology research by revealing the dynamic patterns of genes. Such experiments create a great demand for new analytic approaches to identify differentially expressed (DE) genes based on large-scale time-course count data. Existing methods, however, are suboptimal with respect to power and may lack theoretical justification. Furthermore, most existing tests are designed to distinguish among conditions based on overall differential patterns across time, though in practice, a variety of composite hypotheses are of more scientific interest. Finally, some current methods may fail to control the false discovery rate. In this paper, we propose a new model and testing procedure to address the above issues simultaneously. Specifically, conditional on a latent Gaussian mixture with evolving means, we model the data by negative binomial distributions. Motivated by Storey (2007) and Hwang and Liu (2010), we introduce a general testing framework based on the proposed model and show that the proposed test enjoys the optimality property of maximum average power. The test allows not only identification of traditional DE genes but also testing of a variety of composite hypotheses of biological interest. We establish the identifiability of the proposed model, implement the proposed method via efficient algorithms, and demonstrate its good performance via simulation studies. The procedure reveals interesting biological insights, when applied to data from an experiment that examines the effect of varying light environments on the fundamental physiology of the marine diatom Phaeodactylum tricornutum.
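The hierarchy described in the abstract (negative binomial counts, conditional on a latent Gaussian mixture whose means evolve over time) can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation: the log link, the mean/dispersion parameterization, the two component trajectories, the mixture weights, and all function names are hypothetical choices made for illustration.

```python
import math
import random


def nb_logpmf(y, mean, dispersion):
    """Log-pmf of a negative binomial with the given mean and
    variance = mean + dispersion * mean**2 (an assumed parameterization)."""
    r = 1.0 / dispersion  # "size" parameter
    return (math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
            + r * math.log(r / (r + mean))
            + y * math.log(mean / (r + mean)))


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the modest rates used here."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def sample_nb(mean, dispersion, rng):
    """Draw a negative binomial count as a gamma-Poisson mixture."""
    r = 1.0 / dispersion
    lam = rng.gammavariate(r, mean / r)  # shape r, scale mean / r
    return sample_poisson(lam, rng)


def simulate_gene(trajectories, weights, latent_sd, dispersion, rng):
    """Simulate one gene's counts over time.

    trajectories: per-component lists of log-scale means, one per time point
    weights: mixture weights (hypothetical values chosen by the caller)
    """
    k = rng.choices(range(len(trajectories)), weights=weights)[0]
    counts = []
    for m in trajectories[k]:
        log_mean = rng.gauss(m, latent_sd)  # latent Gaussian draw
        counts.append(sample_nb(math.exp(log_mean), dispersion, rng))
    return k, counts


rng = random.Random(0)
flat = [3.0, 3.0, 3.0, 3.0]    # component with a constant mean (assumed)
rising = [2.0, 2.5, 3.0, 3.5]  # component whose mean evolves over time (assumed)
component, counts = simulate_gene([flat, rising], [0.7, 0.3], 0.2, 0.1, rng)
print(component, counts)
```

In this toy setup, a gene drawn from the `flat` component would count as non-DE over time, while one drawn from `rising` would not. The testing procedure itself (the maximum average power statistic and the FDR control) is not reproduced here, since the abstract does not specify it.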


Similar articles

1. Large scale maximum average power multiple inference on time-course count data with application to RNA-seq analysis.
Biometrics. 2020 Mar;76(1):9-22. doi: 10.1111/biom.13144. Epub 2019 Nov 14.
2. An optimal test with maximum average power while controlling FDR with application to RNA-seq data.
Biometrics. 2013 Sep;69(3):594-605. doi: 10.1111/biom.12036. Epub 2013 Jul 26.
3. Sample size calculations for the differential expression analysis of RNA-seq data using a negative binomial regression model.
Stat Appl Genet Mol Biol. 2019 Jan 22;18(1):/j/sagmb.2019.18.issue-1/sagmb-2018-0021/sagmb-2018-0021.xml. doi: 10.1515/sagmb-2018-0021.
4. Power analysis for RNA-Seq differential expression studies using generalized linear mixed effects models.
BMC Bioinformatics. 2020 May 19;21(1):198. doi: 10.1186/s12859-020-3541-7.
5. Sample size calculation while controlling false discovery rate for differential expression analysis with RNA-sequencing experiments.
BMC Bioinformatics. 2016 Mar 31;17:146. doi: 10.1186/s12859-016-0994-9.
6. An evaluation of RNA-seq differential analysis methods.
PLoS One. 2022 Sep 16;17(9):e0264246. doi: 10.1371/journal.pone.0264246. eCollection 2022.
7. Robust identification of differentially expressed genes from RNA-seq data.
Genomics. 2020 Mar;112(2):2000-2010. doi: 10.1016/j.ygeno.2019.11.012. Epub 2019 Nov 20.
8. Differentially expressed heterogeneous overdispersion genes testing for count data.
PLoS One. 2024 Jul 17;19(7):e0300565. doi: 10.1371/journal.pone.0300565. eCollection 2024.
9. A two-step integrated approach to detect differentially expressed genes in RNA-Seq data.
J Bioinform Comput Biol. 2016 Dec;14(6):1650034. doi: 10.1142/S0219720016500347. Epub 2016 Sep 15.
10. On the utility of RNA sample pooling to optimize cost and statistical power in RNA sequencing experiments.
BMC Genomics. 2020 Apr 19;21(1):312. doi: 10.1186/s12864-020-6721-y.

Cited by

1. Temporal Dynamic Methods for Bulk RNA-Seq Time Series Data.
Genes (Basel). 2021 Feb 27;12(3):352. doi: 10.3390/genes12030352.