Statistical analysis of multi-dimensional, temporal gene expression of stem cells to elucidate colony size-dependent neural differentiation.

Author information

Joshi Ramila, Fuller Brendan, Li Jun, Tavana Hossein

Affiliation

Department of Biomedical Engineering, The University of Akron, 260 S. Forge St., Akron, Ohio 44325, USA.

Publication information

Mol Omics. 2018 Apr 16;14(2):109-120. doi: 10.1039/c8mo00011e.

Abstract

High-throughput gene expression analysis using qPCR is commonly used to identify molecular markers of complex cellular processes. However, statistical analysis of multi-dimensional, temporal gene expression data is complicated by limited biological replicates and a large number of measurements. Moreover, many available statistical tools for time series analysis assume that the data sequence is stationary, that is, that its statistical properties do not evolve over time. Under this assumption, the parameters used to model the time series are fixed and can therefore be estimated by pooling the data. In many cases, however, the dynamic processes of biological systems involve abrupt changes at unknown time points, and the stationarity assumption breaks down. We addressed this problem using a combination of statistical methods including hierarchical clustering, change point detection, and multiple testing. We applied this multi-step method to multi-dimensional, temporal gene expression data from our study of colony size-dependent neural differentiation of stem cells. The gene expression data form time series because the observations were recorded sequentially over time. Hierarchical clustering segregated the genes into three distinct clusters based on their temporal expression profiles; change point detection identified the specific time points at which the entire dataset was divided into several homogeneous subsets, allowing a separate analysis of each subset; and a multiple testing procedure identified the differentially expressed genes in each cluster within each subset of the data. We established that our multi-step approach pinpoints specific sets of genes that underlie colony size-mediated neural differentiation of stem cells, and we demonstrated its advantages over conventional parametric and non-parametric tests that do not take the temporal dynamics of the data into account. Importantly, our proposed approach is broadly applicable to any multivariate dataset of limited sample size from high-throughput and high-content screening, such as in drug and biomarker discovery studies.
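
The three steps named in the abstract can be illustrated with a minimal sketch on toy data. The abstract does not state which linkage, distance, change point model, or multiple testing correction the authors used, so everything below is an assumption chosen for illustration: average linkage on a correlation distance, a single mean-shift change point found by least squares, and the Benjamini-Hochberg procedure. The function names and the toy numbers are hypothetical, and the p-values are taken as given rather than computed from a test.

```python
from math import sqrt


def correlation_distance(a, b):
    """1 - Pearson correlation of two equal-length profiles.

    Assumes neither profile is constant (otherwise the variance is zero).
    """
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return 1.0 - cov / sqrt(va * vb)


def average_linkage(profiles, n_clusters):
    """Step 1 (assumed variant): agglomerative clustering of gene profiles.

    Repeatedly merges the two clusters with the smallest average pairwise
    correlation distance until n_clusters remain. Returns lists of indices.
    """
    clusters = [[i] for i in range(len(profiles))]

    def dist(c1, c2):
        pairs = [(i, j) for i in c1 for j in c2]
        total = sum(correlation_distance(profiles[i], profiles[j]) for i, j in pairs)
        return total / len(pairs)

    while len(clusters) > n_clusters:
        candidates = [(a, b) for a in range(len(clusters))
                      for b in range(a + 1, len(clusters))]
        a, b = min(candidates, key=lambda ab: dist(clusters[ab[0]], clusters[ab[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return [sorted(c) for c in clusters]


def best_change_point(series):
    """Step 2 (assumed variant): single mean-shift change point.

    Returns the index k that minimises the summed squared error of the
    two segments series[:k] and series[k:].
    """
    def sse(segment):
        mean = sum(segment) / len(segment)
        return sum((x - mean) ** 2 for x in segment)

    return min(range(1, len(series)), key=lambda k: sse(series[:k]) + sse(series[k:]))


def benjamini_hochberg(pvalues, alpha=0.05):
    """Step 3 (assumed variant): Benjamini-Hochberg step-up procedure.

    Returns one boolean per p-value, True where the hypothesis is rejected.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= alpha * rank / m:
            cutoff = rank  # keep the largest rank that passes its threshold
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected


# Toy data: four hypothetical genes measured at six time points.
profiles = [
    [1, 2, 3, 4, 5, 6],  # rising
    [2, 3, 4, 5, 6, 8],  # rising
    [6, 5, 4, 3, 2, 1],  # falling
    [5, 5, 4, 2, 1, 1],  # falling
]
print(average_linkage(profiles, 2))
print(best_change_point([1.0, 1.1, 0.9, 3.0, 3.2, 2.9]))
print(benjamini_hochberg([0.001, 0.02, 0.03, 0.2]))
```

In a full analysis along the lines the abstract describes, the change point would split the time axis into subsets, and the multiple testing step would then be run separately for each gene cluster within each subset.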
