CART variance stabilization and regularization for high-throughput genomic data.

Author information

Papana Ariadni, Ishwaran Hemant

Affiliation

Department of Statistics, Case University, 10900 Euclid Avenue, Cleveland, OH 44106, USA.

Publication information

Bioinformatics. 2006 Sep 15;22(18):2254-61. doi: 10.1093/bioinformatics/btl384. Epub 2006 Jul 14.

Abstract

MOTIVATION

mRNA expression data obtained from high-throughput DNA microarrays exhibit strong departures from homogeneity of variances. Often a complex relationship between mean expression value and variance is seen. Variance stabilization of such data is crucial for many types of statistical analyses, while regularization of variances (pooling of information) can greatly improve overall accuracy of test statistics.

RESULTS

A Classification and Regression Tree (CART) procedure is introduced for variance stabilization as well as regularization. The CART procedure adaptively clusters genes by variance. Combining local and cluster-wide information improves estimation of population variances, which in turn improves test statistics, while cluster-wide information allows for variance stabilization of the data.
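The idea summarized above can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details are given in the full paper and not in this abstract. The sketch assumes a one-dimensional regression-tree split on log sample variances to form clusters, a fixed hypothetical `weight` for blending each gene's own variance with its cluster mean, and division by the cluster-level standard deviation for stabilization. All function names and parameters (`best_split`, `cluster_by_variance`, `regularize_and_stabilize`, `max_clusters`, `min_size`, `weight`) are illustrative assumptions.

```python
import numpy as np


def best_split(sorted_vals, min_size):
    """Find the cut of a sorted 1-D array that most reduces the sum of squares."""
    n = len(sorted_vals)
    if n < 2 * min_size:
        return None, 0.0
    csum = np.cumsum(sorted_vals)
    csq = np.cumsum(sorted_vals ** 2)
    total_sse = csq[-1] - csum[-1] ** 2 / n
    best_i, best_gain = None, 0.0
    for i in range(min_size, n - min_size + 1):  # i = size of the left group
        left_sse = csq[i - 1] - csum[i - 1] ** 2 / i
        right_sum = csum[-1] - csum[i - 1]
        right_sse = (csq[-1] - csq[i - 1]) - right_sum ** 2 / (n - i)
        gain = total_sse - left_sse - right_sse
        if gain > best_gain:
            best_i, best_gain = i, gain
    return best_i, best_gain


def cluster_by_variance(log_var, max_clusters=4, min_size=5):
    """Greedy regression-tree style clustering of genes by log variance."""
    order = np.argsort(log_var)
    sorted_vals = log_var[order]
    segments = [(0, len(sorted_vals))]  # half-open ranges over the sorted genes
    while len(segments) < max_clusters:
        best = None
        for k, (a, b) in enumerate(segments):
            i, gain = best_split(sorted_vals[a:b], min_size)
            if i is not None and (best is None or gain > best[0]):
                best = (gain, k, a + i)
        if best is None:  # no segment can be split any further
            break
        _, k, cut = best
        a, b = segments.pop(k)
        segments[k:k] = [(a, cut), (cut, b)]
    labels = np.empty(len(log_var), dtype=int)
    for c, (a, b) in enumerate(segments):
        labels[order[a:b]] = c
    return labels


def regularize_and_stabilize(x, labels, weight=0.5):
    """Blend gene and cluster variances; rescale data by the cluster-level SD."""
    gene_var = x.var(axis=1, ddof=1)
    cluster_var = np.empty_like(gene_var)
    for c in np.unique(labels):
        members = labels == c
        cluster_var[members] = gene_var[members].mean()
    # Regularization (assumed form): local and cluster-wide information combined.
    shrunk_var = weight * gene_var + (1.0 - weight) * cluster_var
    # Stabilization (assumed form): cluster-wide information only.
    stabilized = x / np.sqrt(cluster_var)[:, None]
    return shrunk_var, stabilized


# Simulated example: 300 genes, 6 arrays, three underlying variance levels.
rng = np.random.default_rng(0)
true_sd = np.repeat([0.5, 2.0, 8.0], 100)
x = rng.normal(0.0, true_sd[:, None], size=(300, 6))
labels = cluster_by_variance(np.log(x.var(axis=1, ddof=1)), max_clusters=3)
shrunk_var, stabilized = regularize_and_stabilize(x, labels)
```

In this sketch the regularized variances would feed into a test statistic such as a t-statistic, while the rescaled matrix `stabilized` has an average within-cluster variance of one by construction. How the published procedure chooses the number of clusters and the weighting is not stated in the abstract.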

AVAILABILITY

Sufficient details for our CART procedure are given so that the interested reader can program the method for themselves. The algorithm is also accessible within the Java software package BAMarray(TM), which is freely available to non-commercial users at www.bamarray.com.

CONTACT

hemant.ishwaran@gmail.com.
