An efficient method to identify differentially expressed genes in microarray experiments.

Authors

Qin Huaizhen, Feng Tao, Harding Scott A, Tsai Chung-Jui, Zhang Shuanglin

Affiliations

Department of Mathematical Sciences, Biotechnology Research Center, School of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI 49931, USA.

Publication information

Bioinformatics. 2008 Jul 15;24(14):1583-9. doi: 10.1093/bioinformatics/btn215. Epub 2008 May 3.

Abstract

MOTIVATION

Microarray experiments typically analyze thousands to tens of thousands of genes from small numbers of biological replicates. The fact that genes are normally expressed in functionally relevant patterns suggests that gene-expression data can be stratified and clustered into relatively homogeneous groups. Cluster-wise dimensionality reduction should make it feasible to improve screening power while minimizing information loss.

RESULTS

We propose a powerful and computationally simple method for finding differentially expressed genes in small microarray experiments. The method incorporates a novel stratification-based tight clustering algorithm, principal component analysis and information pooling. Comprehensive simulations show that our method is substantially more powerful than the popular SAM and eBayes approaches. We applied the method to three real microarray datasets: one from a Populus nitrogen stress experiment with 3 biological replicates; and two from public microarray datasets of human cancers with 10 to 40 biological replicates. In all three analyses, our method proved more robust than the popular alternatives for identification of differentially expressed genes.
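The abstract does not spell out the algorithm, so the following Python sketch is only an illustration of one common reading of "information pooling": with very few replicates, a per-gene variance estimate is unstable, and borrowing variance information from other genes in the same homogeneous cluster can stabilize the test statistic. It is not the authors' method, which additionally involves stratification-based tight clustering and principal component analysis. The function name `pooled_t_statistics`, the dictionary inputs, and the assumption that cluster labels are already available are all hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistics(control, treated, clusters):
    """Per-gene t-like statistics with variance pooled within each cluster.

    control, treated: dict mapping gene -> list of replicate expression values
    clusters: dict mapping gene -> cluster label (assumed to be given;
              the clustering step itself is not shown here)
    """
    # Per-gene pooled within-condition variance (standard two-sample pooling).
    gene_var = {}
    for g in control:
        n_c, n_t = len(control[g]), len(treated[g])
        ss = (n_c - 1) * variance(control[g]) + (n_t - 1) * variance(treated[g])
        gene_var[g] = ss / (n_c + n_t - 2)

    # Pool across genes: average the per-gene variances within each cluster.
    members = {}
    for g, label in clusters.items():
        members.setdefault(label, []).append(g)
    cluster_var = {
        label: mean(gene_var[g] for g in genes)
        for label, genes in members.items()
    }

    # Difference in means, scaled by the cluster-level standard error.
    # Note: a cluster whose pooled variance is zero would divide by zero.
    stats = {}
    for g in control:
        n_c, n_t = len(control[g]), len(treated[g])
        se = sqrt(cluster_var[clusters[g]] * (1 / n_c + 1 / n_t))
        stats[g] = (mean(treated[g]) - mean(control[g])) / se
    return stats


# Toy data: three genes, three replicates per condition, two clusters.
control = {"g1": [1.0, 2.0, 3.0], "g2": [2.0, 3.0, 4.0], "g3": [5.0, 5.0, 8.0]}
treated = {"g1": [4.0, 5.0, 6.0], "g2": [2.0, 3.0, 4.0], "g3": [5.0, 6.0, 7.0]}
clusters = {"g1": "A", "g2": "A", "g3": "B"}

stats = pooled_t_statistics(control, treated, clusters)
for gene in sorted(stats):
    print(gene, round(stats[gene], 3))
```

Averaging variances within a cluster is the simplest possible pooling rule; shrinkage estimators such as the one used by eBayes weight the gene-level and group-level estimates instead of replacing one with the other.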

AVAILABILITY

The C++ code to implement the proposed method is available upon request for academic use.
