Reproducibility assessment of independent component analysis of expression ratios from DNA microarrays.

Author information

Kreil David Philip, MacKay David J C

Affiliation

Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK.

Publication information

Comp Funct Genomics. 2003;4(3):300-17. doi: 10.1002/cfg.298.

Abstract

DNA microarrays allow the measurement of transcript abundances for thousands of genes in parallel. Most commonly, a particular sample of interest is studied next to a neutral control, examining relative changes (ratios). Independent component analysis (ICA) is a promising modern method for the analysis of such experiments. The condition of ICA algorithms can, however, depend on the characteristics of the data examined, making algorithm properties such as robustness specific to the given application domain. To address the lack of studies examining the robustness of ICA applied to microarray measurements, we report on the stability of variational Bayesian ICA in this domain. Microarray data are usually preprocessed and transformed. Hence, we first examined alternative transforms and data selections for the smallest modelling reconstruction errors. Log-ratio data are reconstructed better than non-transformed ratio data by our linear model with a Gaussian error term. To compare ICA results we must allow for ICA invariance under rescaling and permutation of the extracted signatures, which hold the loadings of the original variables (gene transcript ratios) on particular latent variables. We introduced a method to optimally match corresponding signatures between sets of results. The stability of signatures was then examined after (1) repetition of the same analysis run with different random number generator seeds, and (2) repetition of the analysis with partial data sets. The effects of both dropping a proportion of the gene transcript ratios and dropping measurements for several samples have been studied. In summary, signatures with a high relative data power were very likely to be retained, resulting in an overall stability of the analyses. Our analysis of 63 yeast wild-type vs. wild-type experiments, moreover, yielded 10 reliably identified signatures, demonstrating that the variance observed is not just noise.
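
The matching step mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which similarity measure or search procedure the authors used, so everything here is an assumption made for illustration: the absolute Pearson correlation as a similarity that ignores rescaling and sign, a brute-force search over permutations, the function names `abs_correlation` and `match_signatures`, and the toy data.

```python
from itertools import permutations
from math import sqrt


def abs_correlation(a, b):
    """Absolute Pearson correlation: unchanged by rescaling or sign flips."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    dev_a = [x - mean_a for x in a]
    dev_b = [y - mean_b for y in b]
    norm = sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    return abs(sum(x * y for x, y in zip(dev_a, dev_b))) / norm if norm else 0.0


def match_signatures(run_a, run_b):
    """Pair run_a[i] with run_b[perm[i]] so that total similarity is maximal.

    Brute force over all permutations (an assumption of this sketch); this is
    only practical for a handful of signatures.
    """
    k = len(run_a)
    sim = [[abs_correlation(a, b) for b in run_b] for a in run_a]
    best = max(
        permutations(range(k)),
        key=lambda p: sum(sim[i][p[i]] for i in range(k)),
    )
    return list(best), [sim[i][best[i]] for i in range(k)]


# Hypothetical toy signatures (loadings of five gene transcript ratios).
run_a = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, -1.0, 0.0, 1.0, -2.0],
    [1.0, -2.0, 1.0, -2.0, 1.0],
]
# A second "run": the same signatures, permuted, rescaled, one sign-flipped.
run_b = [
    [-2.0 * x for x in run_a[2]],
    [0.5 * x for x in run_a[0]],
    [3.0 * x for x in run_a[1]],
]

perm, scores = match_signatures(run_a, run_b)
print(perm)    # index into run_b matched to each signature of run_a
print(scores)  # similarity of each matched pair
```

For more than a few signatures, the permutation search would normally be replaced by an assignment solver such as `scipy.optimize.linear_sum_assignment`, which finds the same optimal pairing in polynomial time. Low similarity scores for a matched pair would then flag signatures that were not reproduced between runs.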
