Kayano Mitsunori, Matsui Hidetoshi, Yamaguchi Rui, Imoto Seiya, Miyano Satoru
Department of Animal and Food Hygiene, Obihiro University of Agriculture and Veterinary Medicine, Inada-cho, Obihiro, Hokkaido 080-8555, Japan
Faculty of Mathematics, Kyushu University, 744, Motooka, Nishi-ku, Fukuoka 819-0395, Japan
Biostatistics. 2016 Apr;17(2):235-48. doi: 10.1093/biostatistics/kxv037. Epub 2015 Sep 28.
High-throughput time course expression profiles have become available over the last decade owing to developments in measurement techniques and devices. Functional data analysis, which treats smoothed curves rather than the originally observed discrete data, is effective for time course expression profiles in terms of dimension reduction, robustness, and applicability to data measured at a small number of irregularly spaced time points. However, statistical methods for differential analysis of time course expression profiles have not been well established. We propose a functional logistic model based on elastic net regularization (F-Logistic) to identify genes with dynamic alterations in case/control studies. We employ a mixed model as a smoothing method to obtain functional data; F-Logistic is then applied to time course profiles measured at a small number of irregularly spaced time points. We evaluate the performance of F-Logistic against another functional data approach, the functional ANOVA test (F-ANOVA), by applying both methods to real and synthetic time course data sets. The real data sets consist of time course gene expression profiles on the long-term effects of recombinant interferon β on disease progression in multiple sclerosis. In case/control studies based on time course expression profiles, F-Logistic distinguishes dynamic alterations that competing approaches such as F-ANOVA cannot find. F-Logistic is effective for time-dependent biomarker detection, diagnosis, and therapy.
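The two-stage idea described above (smooth each subject's irregularly sampled profile into a curve, then fit an elastic-net-penalized logistic model on the curve's coefficients) can be illustrated with a minimal sketch. This is not the authors' F-Logistic implementation: a per-subject straight-line fit stands in for the mixed-model smoothing, the penalized model is fitted by plain proximal gradient descent, and the sampling schedules, effect size, noise level, and tuning values (`lam`, `alpha`, `step`) are all assumptions chosen only for illustration.

```python
import math
import random


def fit_line(times, values):
    """Least-squares line for one subject; returns [level at t=0, slope].
    Assumption: a straight line stands in for mixed-model smoothing."""
    n = len(times)
    t_bar = sum(times) / n
    v_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    slope = sxy / sxx
    return [v_bar - slope * t_bar, slope]


def standardize(X):
    """Center and scale each column so one penalty treats all features alike."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / n) or 1.0
           for j in range(p)]
    return [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in X]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(z, a):
    """Proximal operator of the L1 penalty."""
    return math.copysign(max(abs(z) - a, 0.0), z)


def fit_elastic_net_logistic(X, y, lam=0.05, alpha=0.5, step=0.5, n_iter=1000):
    """Minimize mean log-loss + lam * (alpha * L1 + (1 - alpha) / 2 * L2^2)
    by proximal gradient descent; the intercept is left unpenalized."""
    n, p = len(X), len(X[0])
    b0, b = 0.0, [0.0] * p
    for _ in range(n_iter):
        resid = [sigmoid(b0 + sum(bj * xj for bj, xj in zip(b, row))) - yi
                 for row, yi in zip(X, y)]
        grad0 = sum(resid) / n
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n
                + lam * (1.0 - alpha) * b[j] for j in range(p)]
        b0 -= step * grad0
        b = [soft_threshold(bj - step * gj, step * lam * alpha)
             for bj, gj in zip(b, grad)]
    return b0, b


# Synthetic case/control data (hypothetical): both groups start at the same
# level, but cases (label 1) rise over time. Each subject is measured at four
# irregularly spaced time points taken from one of three assumed schedules.
rng = random.Random(0)
schedules = [[0.0, 0.1, 0.5, 1.0], [0.0, 0.3, 0.4, 0.9], [0.05, 0.2, 0.7, 1.0]]
features, labels = [], []
for i in range(40):
    label = i % 2
    times = schedules[i % 3]
    values = [1.0 + 2.0 * label * t + rng.gauss(0.0, 0.1) for t in times]
    features.append(fit_line(times, values))
    labels.append(label)

X = standardize(features)
b0, coef = fit_elastic_net_logistic(X, labels)
predicted = [1 if sigmoid(b0 + sum(c * x for c, x in zip(coef, row))) >= 0.5 else 0
             for row in X]
accuracy = sum(int(p == y) for p, y in zip(predicted, labels)) / len(labels)
print("coefficients [level, slope]:", coef)
print("training accuracy:", accuracy)
```

In this toy setting the two groups differ only in how the profile changes over time, so the slope coefficient should carry the signal while the elastic net penalty shrinks the level coefficient toward zero. A richer basis (for example, splines) and a cross-validated choice of `lam` and `alpha` would be natural extensions, but they go beyond what the abstract specifies.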