Relative impact of key sources of systematic noise in Affymetrix and Illumina gene-expression microarray experiments.

Author information

Applied Bioinformatics of Cancer Group, Breakthrough Breast Cancer Research Unit, Institute of Genetics and Molecular Medicine, Crewe Road South, Edinburgh, Edinburgh, EH4 2XR, UK.

Publication information

BMC Genomics. 2011 Dec 1;12:589. doi: 10.1186/1471-2164-12-589.

PMID:22133085

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC3269440/

Abstract

BACKGROUND

Systematic processing noise, which includes batch effects, is very common in microarray experiments but is often ignored despite its potential to confound or compromise experimental results. Compromised results are most likely when re-analysing or integrating datasets from public repositories, because each dataset is generated under different conditions. To better understand the relative noise contributions of various factors in experimental design, we assessed several Illumina and Affymetrix datasets for technical variation between replicate hybridisations of Universal Human Reference RNA (UHRR) and individual or pooled breast-tumour RNA.

RESULTS

A varying degree of systematic noise was observed in each of the datasets; however, in all cases the relative amount of variation between standard control RNA replicates was found to be greatest at earlier points in the sample-preparation workflow. For example, 40.6% of the total variation in reported expression was attributed to replicate extractions, compared to 13.9% due to amplification/labelling and 10.8% between replicate hybridisations. Deliberate probe-wise batch-correction methods were effective in reducing the magnitude of this variation, although the level of improvement was dependent on the sources of noise included in the model. Systematic noise introduced at the chip, run, and experiment levels of a combined Illumina dataset was found to be highly dependent upon the experimental design. Both UHRR and pools of RNA derived from the samples of interest modelled technical variation well, although the pools were significantly better correlated (4% average improvement) and better emulated the effects of systematic noise, over all probes, than the UHRRs. The effect of this noise was not uniform over all probes, with low GC-content probes found to be more vulnerable to batch variation than probes with a higher GC-content.
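The abstract does not say which probe-wise batch-correction method was used, so the following is only a minimal sketch of one generic approach: per-probe mean-centering within each batch. The function name `center_probes_by_batch` and its inputs (a probes-by-samples matrix of log-scale intensities and one batch label per sample) are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def center_probes_by_batch(expr, batches):
    """Remove per-probe batch offsets by mean-centering within each batch.

    expr: 2-D array-like, rows = probes, columns = samples (log-scale values).
    batches: 1-D sequence with one batch label per sample (hypothetical input).
    Returns a new array in which every batch shares the probe's overall mean.
    """
    expr = np.asarray(expr, dtype=float)
    batches = np.asarray(batches)
    corrected = expr.copy()
    # Overall mean of each probe, kept so values stay on the original scale.
    overall_mean = expr.mean(axis=1, keepdims=True)
    for label in np.unique(batches):
        cols = batches == label
        # Mean of each probe within this batch only.
        batch_mean = expr[:, cols].mean(axis=1, keepdims=True)
        corrected[:, cols] = expr[:, cols] - batch_mean + overall_mean
    return corrected

# Two probes, four samples, two batches with an obvious offset between them.
example = center_probes_by_batch(
    [[1.0, 2.0, 5.0, 6.0], [3.0, 3.0, 7.0, 7.0]],
    ["a", "a", "b", "b"],
)
```

A correction of this kind only removes the sources of noise that are encoded in the batch labels, which is consistent with the observation that the improvement depends on what the model includes. If batch is confounded with the biological groups of interest, centering would remove biological signal as well, so a sketch like this is only appropriate for balanced designs.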

CONCLUSIONS

The magnitude of systematic processing noise in a microarray experiment varies across probes and experiments; however, procedures earlier in the sample-preparation workflow are generally liable to introduce the most noise. Careful experimental design is important to protect against noise, detailed meta-data should always be provided, and diagnostic procedures should be routinely performed prior to downstream analyses to detect bias in microarray studies.
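As one possible form such a diagnostic could take, the sketch below computes, for each probe, the fraction of total variance that lies between batches (between-batch sum of squares divided by total sum of squares). This is an illustrative assumption rather than the procedure used in the paper; the function name `batch_variance_fraction` and its inputs are hypothetical.

```python
import numpy as np

def batch_variance_fraction(expr, batches):
    """Per-probe fraction of total variance explained by batch membership.

    expr: 2-D array-like, rows = probes, columns = samples (log-scale values).
    batches: 1-D sequence with one batch label per sample (hypothetical input).
    Returns a 1-D array with one value in [0, 1] per probe.
    """
    expr = np.asarray(expr, dtype=float)
    batches = np.asarray(batches)
    grand_mean = expr.mean(axis=1)
    # Total sum of squares of each probe around its grand mean.
    total_ss = ((expr - grand_mean[:, None]) ** 2).sum(axis=1)
    between_ss = np.zeros(expr.shape[0])
    for label in np.unique(batches):
        cols = batches == label
        batch_mean = expr[:, cols].mean(axis=1)
        # Each batch contributes n_batch * (batch mean - grand mean)^2.
        between_ss += cols.sum() * (batch_mean - grand_mean) ** 2
    # Probes with no variation at all get a fraction of 0 rather than NaN.
    return np.divide(
        between_ss, total_ss, out=np.zeros_like(between_ss), where=total_ss > 0
    )

# Probe 0 differs only between batches; probe 1 varies only within batches.
fractions = batch_variance_fraction(
    [[1.0, 1.0, 3.0, 3.0], [1.0, 3.0, 1.0, 3.0]],
    ["a", "a", "b", "b"],
)
```

Probes with values close to 1 are dominated by batch differences. Summarising these fractions across probes, or against a probe property such as GC-content, is one way to check for bias before downstream analysis.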

Similar articles

Relative impact of key sources of systematic noise in Affymetrix and Illumina gene-expression microarray experiments.

BMC Genomics. 2011 Dec 1;12:589. doi: 10.1186/1471-2164-12-589.

Correcting for intra-experiment variation in Illumina BeadChip data is necessary to generate robust gene-expression profiles.

BMC Genomics. 2010 Feb 24;11:134. doi: 10.1186/1471-2164-11-134.

A revised design for microarray experiments to account for experimental noise and uncertainty of probe response.

PLoS One. 2014 Mar 11;9(3):e91295. doi: 10.1371/journal.pone.0091295. eCollection 2014.

Universal Reference RNA as a standard for microarray experiments.

BMC Genomics. 2004 Mar 9;5(1):20. doi: 10.1186/1471-2164-5-20.

Comparison of gene expression microarray data with count-based RNA measurements informs microarray interpretation.

BMC Genomics. 2014 Aug 4;15(1):649. doi: 10.1186/1471-2164-15-649.

Interactively optimizing signal-to-noise ratios in expression profiling: project-specific algorithm selection and detection p-value weighting in Affymetrix microarrays.

Bioinformatics. 2004 Nov 1;20(16):2534-44. doi: 10.1093/bioinformatics/bth280. Epub 2004 Apr 29.

Statistical analysis of high-density oligonucleotide arrays: a multiplicative noise model.

Bioinformatics. 2002 Dec;18(12):1633-40. doi: 10.1093/bioinformatics/18.12.1633.

Sources of variation in Affymetrix microarray experiments.

BMC Bioinformatics. 2005 Aug 29;6:214. doi: 10.1186/1471-2105-6-214.

A new non-linear normalization method for reducing variability in DNA microarray experiments.

Genome Biol. 2002 Aug 30;3(9):research0048. doi: 10.1186/gb-2002-3-9-research0048.

Redefinition of Affymetrix probe sets by sequence overlap with cDNA microarray probes reduces cross-platform inconsistencies in cancer-associated gene expression measurements.

BMC Bioinformatics. 2005 Apr 25;6:107. doi: 10.1186/1471-2105-6-107.

Cited by

Batch-effect detection, correction and characterisation in Illumina HumanMethylation450 and MethylationEPIC BeadChip array data.

Clin Epigenetics. 2022 Apr 29;14(1):58. doi: 10.1186/s13148-022-01277-9.

Identification of early liver toxicity gene biomarkers using comparative supervised machine learning.

Sci Rep. 2020 Nov 5;10(1):19128. doi: 10.1038/s41598-020-76129-8.

Unlocking the transcriptomic potential of formalin-fixed paraffin embedded clinical tissues: comparison of gene expression profiling approaches.

BMC Bioinformatics. 2020 Jan 28;21(1):30. doi: 10.1186/s12859-020-3365-5.

Multi-view based integrative analysis of gene expression data for identifying biomarkers.

Sci Rep. 2019 Sep 18;9(1):13504. doi: 10.1038/s41598-019-49967-4.

ALS blood expression profiling identifies new biomarkers, patient subgroups, and evidence for neutrophilia and hypoxia.

J Transl Med. 2019 May 22;17(1):170. doi: 10.1186/s12967-019-1909-0.

Comparison of multiple transcriptomes exposes unified and divergent features of quiescent and activated skeletal muscle stem cells.

Skelet Muscle. 2017 Dec 22;7(1):28. doi: 10.1186/s13395-017-0144-8.

Microarray Meta-Analysis and Cross-Platform Normalization: Integrative Genomics for Robust Biomarker Discovery.

Microarrays (Basel). 2015 Aug 21;4(3):389-406. doi: 10.3390/microarrays4030389.

Gene expression prediction using low-rank matrix completion.

BMC Bioinformatics. 2016 Jun 17;17(1):243. doi: 10.1186/s12859-016-1106-6.

Comparative Analysis of Matrix Metalloproteinase Family Members Reveals That MMP9 Predicts Survival and Response to Temozolomide in Patients with Primary Glioblastoma.

PLoS One. 2016 Mar 29;11(3):e0151815. doi: 10.1371/journal.pone.0151815. eCollection 2016.

Methods that remove batch effects while retaining group differences may lead to exaggerated confidence in downstream analyses.

Biostatistics. 2016 Jan;17(1):29-39. doi: 10.1093/biostatistics/kxv027. Epub 2015 Aug 13.

