Significance and statistical errors in the analysis of DNA microarray data.

Authors

Brody James P, Williams Brian A, Wold Barbara J, Quake Stephen R

Affiliation

Departments of Applied Physics and Biology, California Institute of Technology, Pasadena, CA 91125, USA.

Publication details

Proc Natl Acad Sci U S A. 2002 Oct 1;99(20):12975-8. doi: 10.1073/pnas.162468199. Epub 2002 Sep 16.

DOI: 10.1073/pnas.162468199
PMID: 12235357
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC130571/
Abstract

DNA microarrays are important devices for high throughput measurements of gene expression, but no rational foundation has been established for understanding the sources of within-chip statistical error. We designed a specialized chip and protocol to investigate the distribution and magnitude of within-chip errors and discovered that, as expected from theoretical expectations, measurement errors follow a Lorentzian-like distribution, which explains the widely observed but unexplained ill-reproducibility in microarray data. Using this specially designed chip, we examined a data set of repeated measurements to extract estimates of the distribution and magnitude of statistical errors in DNA microarray measurements. Using the common "ratio of medians" method, we find that the measurements follow a Lorentzian-like distribution, which is problematic for subsequent analysis. We show that a method of analysis dubbed "median of ratios" yields a more Gaussian-like distribution of errors. Finally, we show that the bootstrap algorithm can be used to extract the best estimates of the error in the measurement. Quantifying the statistical error in such measurements has important applications for estimating significance levels, clustering algorithms, and process optimization.
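
The two estimators and the bootstrap step named in the abstract can be illustrated with a small sketch. The abstract does not give the authors' procedure, so everything here is an assumption made for illustration: "ratio of medians" is taken to mean the median pixel intensity of one channel divided by the median of the other, "median of ratios" is taken to mean the median of the pixel-by-pixel ratios, and the bootstrap resamples pixels with replacement. The function names, the synthetic spot, and the noise model are hypothetical.

```python
import random
import statistics


def ratio_of_medians(red, green):
    """Median of one channel divided by the median of the other (assumed definition)."""
    return statistics.median(red) / statistics.median(green)


def median_of_ratios(red, green):
    """Median of the paired, pixel-by-pixel ratios (assumed definition)."""
    return statistics.median([r / g for r, g in zip(red, green)])


def bootstrap_std_error(red, green, estimator, n_resamples=1000, seed=0):
    """Estimate the standard error of `estimator` by resampling pixel pairs
    with replacement and taking the spread of the recomputed estimates."""
    rng = random.Random(seed)
    n = len(red)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(estimator([red[i] for i in idx],
                                   [green[i] for i in idx]))
    return statistics.stdev(estimates)


def simulate_spot(true_ratio=2.0, n_pixels=200, seed=0):
    """Hypothetical synthetic spot: green intensities are uniform and strictly
    positive; red is green scaled by the true ratio with 5% multiplicative noise."""
    rng = random.Random(seed)
    green = [rng.uniform(500.0, 1500.0) for _ in range(n_pixels)]
    red = [true_ratio * g * (1.0 + rng.gauss(0.0, 0.05)) for g in green]
    return red, green


if __name__ == "__main__":
    red, green = simulate_spot()
    for name, est in [("ratio of medians", ratio_of_medians),
                      ("median of ratios", median_of_ratios)]:
        value = est(red, green)
        error = bootstrap_std_error(red, green, est)
        print(f"{name}: {value:.3f} +/- {error:.3f}")
```

This toy data is not meant to reproduce the Lorentzian-like tails reported in the abstract; it only shows how the two estimators differ in construction and how a bootstrap error bar could be attached to either one.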

Similar articles

1. Significance and statistical errors in the analysis of DNA microarray data.
Proc Natl Acad Sci U S A. 2002 Oct 1;99(20):12975-8. doi: 10.1073/pnas.162468199. Epub 2002 Sep 16.
2. An approach for clustering gene expression data with error information.
BMC Bioinformatics. 2006 Jan 12;7:17. doi: 10.1186/1471-2105-7-17.
3. Evaluating concentration estimation errors in ELISA microarray experiments.
BMC Bioinformatics. 2005 Jan 26;6:17. doi: 10.1186/1471-2105-6-17.
4. Bootstrap method for the estimation of measurement uncertainty in spotted dual-color DNA microarrays.
Anal Bioanal Chem. 2007 Dec;389(7-8):2125-41. doi: 10.1007/s00216-007-1617-0. Epub 2007 Sep 27.
5. Statistical analysis of high-density oligonucleotide arrays: a multiplicative noise model.
Bioinformatics. 2002 Dec;18(12):1633-40. doi: 10.1093/bioinformatics/18.12.1633.
6. Gaussian mixture clustering and imputation of microarray data.
Bioinformatics. 2004 Apr 12;20(6):917-23. doi: 10.1093/bioinformatics/bth007. Epub 2004 Jan 29.
7. Including probe-level uncertainty in model-based gene expression clustering.
BMC Bioinformatics. 2007 Mar 21;8:98. doi: 10.1186/1471-2105-8-98.
8. Including probe-level measurement error in robust mixture clustering of replicated microarray gene expression.
Stat Appl Genet Mol Biol. 2010;9:Article42. doi: 10.2202/1544-6115.1600. Epub 2010 Dec 9.
9. Kernel-imbedded Gaussian processes for disease classification using microarray gene expression data.
BMC Bioinformatics. 2007 Feb 28;8:67. doi: 10.1186/1471-2105-8-67.
10. Interactively optimizing signal-to-noise ratios in expression profiling: project-specific algorithm selection and detection p-value weighting in Affymetrix microarrays.
Bioinformatics. 2004 Nov 1;20(16):2534-44. doi: 10.1093/bioinformatics/bth280. Epub 2004 Apr 29.

Cited by

1. A series acceleration algorithm for the gamma-Pareto (type I) convolution and related functions of interest for pharmacokinetics.
J Pharmacokinet Pharmacodyn. 2022 Apr;49(2):191-208. doi: 10.1007/s10928-021-09779-4. Epub 2021 Oct 24.
2. The effect of uncertainty in patient classification on diagnostic performance estimations.
PLoS One. 2019 May 22;14(5):e0217146. doi: 10.1371/journal.pone.0217146. eCollection 2019.
3. Expression proteomics study to determine metallodrug targets and optimal drug combinations.
Sci Rep. 2017 May 8;7(1):1590. doi: 10.1038/s41598-017-01643-1.
4. Ancient hot and cold genes and chemotherapy resistance emergence.
Proc Natl Acad Sci U S A. 2015 Aug 18;112(33):10467-72. doi: 10.1073/pnas.1512396112. Epub 2015 Aug 3.
5. Functional Identification of Target by Expression Proteomics (FITExP) reveals protein targets and highlights mechanisms of action of small molecule drugs.
Sci Rep. 2015 Jun 8;5:11176. doi: 10.1038/srep11176.
6. An analysis of critical factors for quantitative immunoblotting.
Sci Signal. 2015 Apr 7;8(371):rs2. doi: 10.1126/scisignal.2005966.
7. Spatial patterns of genome-wide expression profiles reflect anatomic and fiber connectivity architecture of healthy human brain.
Hum Brain Mapp. 2014 Aug;35(8):4204-18. doi: 10.1002/hbm.22471. Epub 2014 Feb 22.
8. Systematic spatial bias in DNA microarray hybridization is caused by probe spot position-dependent variability in lateral diffusion.
PLoS One. 2011;6(8):e23727. doi: 10.1371/journal.pone.0023727. Epub 2011 Aug 17.
9. Microorganisms with novel dissimilatory (bi)sulfite reductase genes are widespread and part of the core microbiota in low-sulfate peatlands.
Appl Environ Microbiol. 2011 Feb;77(4):1231-42. doi: 10.1128/AEM.01352-10. Epub 2010 Dec 17.
10. Large-scale analysis of network bistability for human cancers.
PLoS Comput Biol. 2010 Jul 8;6(7):e1000851. doi: 10.1371/journal.pcbi.1000851.
