Quality Weighted Mean and T-test in Microarray Analysis Lead to Improved Accuracy in Gene Expression Measurements and Reduced Type I and II Errors in Differential Expression Detection.

Authors

Gao Shouguo, Jia Shuang, Hessner Martin, Wang Xujing

Affiliation

Department of Physics & the Comprehensive Diabetes Center, University of Alabama at Birmingham, 1300 University Blvd, Birmingham, AL 35294, USA.

Publication

J Comput Sci Syst Biol. 2008 Dec 26;1:41. doi: 10.4172/jcsb.1000003.

Abstract

We previously reported Matarray, a microarray image processing and data analysis package in which a quality score is defined for every spot to reflect the reliability and variability of the data acquired from it. In this article we present a new development in Matarray: the quality scores are incorporated as weights in the statistical evaluation and data mining of microarray data. With this approach, poor-quality data are filtered automatically through the reduction in their weights, which eliminates the need to manually flag or remove bad data points and avoids the problem of missing values. More significantly, using a set of control clones spiked in at known input ratios ranging from 1:30 to 30:1, we find that quality-weighted statistics lead to more accurate gene expression measurements and more sensitive detection of expression changes, with significantly lower type II error rates. Further, we applied quality-weighted clustering to a time-course microarray data set and found that the new algorithm improves grouping accuracy. In summary, incorporating a quantitative quality measure of microarray data as a weight in complex data analysis leads to improved reliability and convenience. In addition, it provides a practical way to deal with the missing-value issue in establishing automatic statistical tests.
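The abstract does not give the formulas Matarray uses, so the following is only a minimal sketch of one generic way to turn per-spot quality scores into weights for a mean and a one-sample t statistic. It assumes the values are log-ratios from replicate spots and that each quality score is a non-negative number; the function names, the Kish effective sample size, and the reliability-weight variance correction are illustrative choices, not the authors' implementation.

```python
import math


def weighted_mean(values, weights):
    """Quality-weighted mean: low-quality spots contribute less."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * x for x, w in zip(values, weights)) / total


def weighted_t_statistic(values, weights, mu0=0.0):
    """One-sample weighted t statistic against mu0 (hypothetical helper).

    Returns (t, degrees_of_freedom). With equal weights this reduces to
    the ordinary one-sample t statistic with n - 1 degrees of freedom.
    """
    total = sum(weights)
    mean = weighted_mean(values, weights)
    # Kish effective sample size: equals n when all weights are equal.
    n_eff = total ** 2 / sum(w * w for w in weights)
    if n_eff <= 1:
        raise ValueError("need an effective sample size greater than 1")
    # Weighted variance with the unbiased correction for reliability weights.
    var = sum(w * (x - mean) ** 2 for x, w in zip(values, weights)) / total
    var *= n_eff / (n_eff - 1)
    # Standard error of the weighted mean.
    se = math.sqrt(var / n_eff)
    return (mean - mu0) / se, n_eff - 1


# A spot with quality 0 is effectively filtered out without being deleted.
log_ratios = [1.0, 1.2, 0.8, 5.0]
quality = [1.0, 1.0, 1.0, 0.0]
m = weighted_mean(log_ratios, quality)
t, dof = weighted_t_statistic(log_ratios, quality)
```

In this sketch a zero-quality spot drops out of both the mean and the test, which mirrors the automatic filtering described above: no manual flagging is needed and no missing value is created.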

