Gene set analysis for longitudinal gene expression data.

Affiliation

School of Medicine & Health Sciences, University of North Dakota, Grand Forks, ND 58202, USA.

Publication information

BMC Bioinformatics. 2011 Jul 3;12:273. doi: 10.1186/1471-2105-12-273.

Abstract

BACKGROUND

Gene set analysis (GSA) has become a successful tool for interpreting gene expression profiles in terms of biological functions, molecular pathways, or genomic locations. GSA performs statistical tests for independent microarray samples at the level of gene sets rather than individual genes. An increasing number of microarray studies now explore dynamic changes in gene expression across a variety of species and biological scenarios. In these longitudinal studies, gene expression is measured repeatedly over time, so a GSA must account for within-gene correlations in addition to possible between-gene correlations.

RESULTS

We provide a robust nonparametric approach for comparing the expression of longitudinally measured gene sets under multiple treatments or experimental conditions. The limiting distributions of our statistics are derived for the case where the number of genes goes to infinity while the number of replications can be small. When the number of genes in a gene set is small, we recommend permutation tests based on our nonparametric test statistics to achieve reliable type I error control and better power while incorporating unknown correlations between and within genes. Simulation results demonstrate that the proposed method has greater power than other methods across various data distributions and heteroscedastic correlation structures. The method was applied to an IL-2 stimulation study, in which significantly altered gene sets were identified.
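The permutation idea described above can be illustrated with a minimal sketch. This is not the authors' statistic or their R code: the rank-based statistic below is a simplified stand-in, and the names `rank_statistic` and `permutation_pvalue`, as well as the array layout (subjects by genes by time points), are assumptions made for illustration. The point it shows is that treatment labels are permuted across whole subjects, so each subject's gene-by-time block stays intact and the unknown within-gene and between-gene correlations are preserved under the null.

```python
import numpy as np


def rank_statistic(ranks, labels):
    """Illustrative rank-based statistic (an assumption, not the paper's).

    ranks:  array of shape (n_subjects, n_genes, n_times), ranks taken
            across subjects within each gene/time cell.
    labels: array of shape (n_subjects,) with treatment group ids.
    """
    grand_mean = ranks.mean(axis=0)  # mean rank per gene/time cell
    stat = 0.0
    for g in np.unique(labels):
        mask = labels == g
        group_mean = ranks[mask].mean(axis=0)
        # Weighted squared deviation of group mean ranks, summed over cells.
        stat += mask.sum() * np.sum((group_mean - grand_mean) ** 2)
    return stat


def permutation_pvalue(x, labels, n_perm=999, seed=0):
    """Permutation p-value for one gene set.

    x: array of shape (n_subjects, n_genes, n_times).
    Labels are shuffled across subjects only, so every subject keeps its
    full gene-by-time block and the correlation structure is untouched.
    """
    rng = np.random.default_rng(seed)
    # Ranks across subjects; ties are broken arbitrarily in this sketch.
    ranks = x.argsort(axis=0).argsort(axis=0) + 1.0
    observed = rank_statistic(ranks, labels)
    exceed = 0
    for _ in range(n_perm):
        permuted = rng.permutation(labels)
        if rank_statistic(ranks, permuted) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)
```

Calling `permutation_pvalue(x, labels)` on one gene set returns the observed statistic and its permutation p-value; in a real analysis the p-values across gene sets would still need a multiple-testing adjustment.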

CONCLUSIONS

The simulation study and the real data application showed that the proposed gene set analysis provides a promising tool for longitudinal microarray analysis. R scripts for simulating longitudinal data and calculating the nonparametric statistics are posted on the North Dakota INBRE website http://ndinbre.org/programs/bioinformatics.php. Raw microarray data are available in the Gene Expression Omnibus (National Center for Biotechnology Information) under accession number GSE6085.
