Probe set algorithms: is there a rational best bet?

Author information

Seo Jinwook, Hoffman Eric P

Affiliation

Research Center for Genetic Medicine, Children's National Medical Center, 111 Michigan Ave NW, Washington DC 20010, USA.

Publication information

BMC Bioinformatics. 2006 Aug 30;7:395. doi: 10.1186/1471-2105-7-395.

DOI:10.1186/1471-2105-7-395

PMID:16942624

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC1569879/

Abstract

Affymetrix microarrays have become a standard experimental platform for studies of mRNA expression profiling. Their success is due, in part, to the multiple oligonucleotide features (probes) against each transcript (probe set). This multiple testing allows for more robust background assessments and gene expression measures, and has permitted the development of many computational methods to translate image data into a single normalized "signal" for mRNA transcript abundance. There are now many probe set algorithms that have been developed, with a gradual movement away from chip-by-chip methods (MAS5), to project-based model-fitting methods (dCHIP, RMA, others). Data interpretation is often profoundly changed by choice of algorithm, with disoriented biologists questioning what the "accurate" interpretation of their experiment is. Here, we summarize the debate concerning probe set algorithms. We provide examples of how changes in mismatch weight, normalizations, and construction of expression ratios each dramatically change data interpretation. All interpretations can be considered as computationally appropriate, but with varying biological credibility. We also illustrate the performance of two new hybrid algorithms (PLIER, GC-RMA) relative to more traditional algorithms (dCHIP, MAS5, Probe Profiler PCA, RMA) using an interactive power analysis tool. PLIER appears superior to other algorithms in avoiding false positives with poorly performing probe sets. Based on our interpretation of the literature, and examples presented here, we suggest that the variability in performance of probe set algorithms is more dependent upon assumptions regarding "background", than on calculations of "signal". We argue that "background" is an enormously complex variable that can only be vaguely quantified, and thus the "best" probe set algorithm will vary from project to project.
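
The abstract's point that mismatch weight and the construction of expression ratios can change interpretation can be illustrated with a minimal sketch. This is not MAS5, dCHIP, RMA, PLIER, or any other published algorithm: the summary rule (median of weighted PM minus MM, floored at 1), the function names, and all intensities are hypothetical and chosen only for illustration.

```python
import math
from statistics import median


def probe_set_signal(pm, mm, mm_weight, floor=1.0):
    """Toy probe set summary (assumed for illustration, not a published method).

    Each perfect-match (PM) intensity is corrected by subtracting a weighted
    mismatch (MM) intensity, floored to stay positive, then the probes are
    summarized by their median.
    """
    corrected = [max(p - mm_weight * m, floor) for p, m in zip(pm, mm)]
    return median(corrected)


def log2_ratio(sample_a, sample_b, mm_weight):
    """Log2 expression ratio of sample_a over sample_b for one probe set."""
    sig_a = probe_set_signal(*sample_a, mm_weight)
    sig_b = probe_set_signal(*sample_b, mm_weight)
    return math.log2(sig_a / sig_b)


# Hypothetical (PM, MM) intensities for one probe set with four probe pairs.
control = ([220.0, 260.0, 240.0, 300.0], [200.0, 210.0, 190.0, 230.0])
treated = ([320.0, 380.0, 350.0, 400.0], [200.0, 220.0, 200.0, 240.0])

# Same raw data, three mismatch weights: ignore MM, half weight, full weight.
for w in (0.0, 0.5, 1.0):
    print(f"mm_weight={w:.1f}  log2(treated/control)={log2_ratio(treated, control, w):.2f}")
```

For these made-up numbers, the apparent fold change grows as the mismatch weight rises, because subtracting more of the MM signal shrinks the low-intensity control values proportionally more than the treated values. Each weight is a computationally defensible choice, yet each would lead to a different reading of the same probe set.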

Similar articles

Probe set algorithms: is there a rational best bet?

BMC Bioinformatics. 2006 Aug 30;7:395. doi: 10.1186/1471-2105-7-395.

Interactively optimizing signal-to-noise ratios in expression profiling: project-specific algorithm selection and detection p-value weighting in Affymetrix microarrays.

Bioinformatics. 2004 Nov 1;20(16):2534-44. doi: 10.1093/bioinformatics/bth280. Epub 2004 Apr 29.

Comparison of seven methods for producing Affymetrix expression scores based on False Discovery Rates in disease profiling data.

BMC Bioinformatics. 2005 Feb 10;6:26. doi: 10.1186/1471-2105-6-26.

PQN and DQN: algorithms for expression microarrays.

J Theor Biol. 2006 Nov 21;243(2):273-8. doi: 10.1016/j.jtbi.2006.06.017. Epub 2006 Jun 30.

Utilization of two sample t-test statistics from redundant probe sets to evaluate different probe set algorithms in GeneChip studies.

BMC Bioinformatics. 2006 Jan 10;7:12. doi: 10.1186/1471-2105-7-12.

An interactive power analysis tool for microarray hypothesis testing and generation.

Bioinformatics. 2006 Apr 1;22(7):808-14. doi: 10.1093/bioinformatics/btk052. Epub 2006 Jan 17.

Calculation of reliable transcript levels of annotated genes on the basis of multiple probe-sets in Affymetrix microarrays.

Acta Biochim Pol. 2009;56(2):271-7. Epub 2009 May 12.

Calibration of microarray gene-expression data.

Methods Mol Biol. 2010;576:375-407. doi: 10.1007/978-1-59745-545-9_20.

Evaluation of methods for oligonucleotide array data via quantitative real-time PCR.

BMC Bioinformatics. 2006 Jan 17;7:23. doi: 10.1186/1471-2105-7-23.

Effects of filtering by Present call on analysis of microarray experiments.

BMC Bioinformatics. 2006 Jan 31;7:49. doi: 10.1186/1471-2105-7-49.

Cited by

Identification of new potential downstream transcriptional targets of the strigolactone pathway including glucosinolate biosynthesis.

Plant Direct. 2023 Mar 19;7(3):e486. doi: 10.1002/pld3.486. eCollection 2023 Mar.

Relationship between insulin sensitivity and gene expression in human skeletal muscle.

BMC Endocr Disord. 2021 Feb 27;21(1):32. doi: 10.1186/s12902-021-00687-9.

Transcriptome profiling of anti-müllerian hormone treated preantral/small antral mouse ovary follicles.

Oncotarget. 2018 Jul 13;9(54):30253-30267. doi: 10.18632/oncotarget.25572.

Identification of genes related to consecutive trauma-induced sepsis via gene expression profiling analysis.

Medicine (Baltimore). 2018 Apr;97(15):e0362. doi: 10.1097/MD.0000000000010362.

Gene expression profiling analysis of keloids with and without hydrocortisone treatment.

Exp Ther Med. 2017 Dec;14(6):5283-5288. doi: 10.3892/etm.2017.5263. Epub 2017 Oct 3.

PCB exposure and potential future cancer incidence in Slovak children: an assessment from molecular finger printing by Ingenuity Pathway Analysis (IPA®) derived from experimental and epidemiological investigations.

Environ Sci Pollut Res Int. 2018 Jun;25(17):16493-16507. doi: 10.1007/s11356-017-0149-1. Epub 2017 Nov 15.

Comparative transcriptomics indicate changes in cell wall organization and stress response in seedlings during spaceflight.

Am J Bot. 2017 Aug;104(8):1219-1231. doi: 10.3732/ajb.1700079.

Identification of genes associated with tongue cancer in patients with a history of tobacco and/or alcohol use.

Oncol Lett. 2017 Feb;13(2):629-638. doi: 10.3892/ol.2016.5497. Epub 2016 Dec 14.

Crucial genes associated with diabetic nephropathy explored by microarray analysis.

BMC Nephrol. 2016 Sep 9;17(1):128. doi: 10.1186/s12882-016-0343-2.

Transcriptional profiling and biological pathway analysis of human equivalence PCB exposure in vitro: indicator of disease and disorder development in humans.

Environ Res. 2015 Apr;138:202-16. doi: 10.1016/j.envres.2014.12.031. Epub 2015 Feb 27.

References

How to decide? Different methods of calculating gene expression from short oligonucleotide array data will give different results.

BMC Bioinformatics. 2006 Mar 15;7:137. doi: 10.1186/1471-2105-7-137.

Nuclear envelope dystrophies show a transcriptional fingerprint suggesting disruption of Rb-MyoD pathways in muscle regeneration.

Brain. 2006 Apr;129(Pt 4):996-1013. doi: 10.1093/brain/awl023. Epub 2006 Feb 14.

Effects of filtering by Present call on analysis of microarray experiments.

BMC Bioinformatics. 2006 Jan 31;7:49. doi: 10.1186/1471-2105-7-49.

Comparison of algorithms for the analysis of Affymetrix microarray data as evaluated by co-expression of genes in known operons.

Nucleic Acids Res. 2006 Jan 23;34(2):e8. doi: 10.1093/nar/gnj010.

An interactive power analysis tool for microarray hypothesis testing and generation.

Bioinformatics. 2006 Apr 1;22(7):808-14. doi: 10.1093/bioinformatics/btk052. Epub 2006 Jan 17.

Comparison of Affymetrix GeneChip expression measures.

Bioinformatics. 2006 Apr 1;22(7):789-94. doi: 10.1093/bioinformatics/btk046. Epub 2006 Jan 12.

Stochastic models inspired by hybridization theory for short oligonucleotide arrays.

J Comput Biol. 2005 Jul-Aug;12(6):882-93. doi: 10.1089/cmb.2005.12.882.

Comparison of preprocessing procedures for oligo-nucleotide micro-arrays by parametric bootstrap simulation of spike-in experiments.

Methods Inf Med. 2004;43(5):434-8.

Improved parameter estimation for variance-stabilizing transformation of gene-expression microarray data.

J Bioinform Comput Biol. 2004 Dec;2(4):669-79. doi: 10.1142/s0219720004000806.

Experimental design and low-level analysis of microarray data.

Int Rev Neurobiol. 2004;60:25-58. doi: 10.1016/S0074-7742(04)60002-X.
