Revisiting adverse effects of cross-hybridization in Affymetrix gene expression data: do they matter for correlation analysis?

Authors

Klebanov Lev, Chen Linlin, Yakovlev Andrei

Affiliation

Department of Biostatistics and Computational Biology, University of Rochester, 601 Elmwood Avenue, Rochester, Box 630, New York 14642, USA.

Publication information

Biol Direct. 2007 Nov 7;2:28. doi: 10.1186/1745-6150-2-28.


DOI: 10.1186/1745-6150-2-28
PMID: 17988401
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2211459/
Abstract

BACKGROUND: This work was undertaken in response to a recently published paper by Okoniewski and Miller (BMC Bioinformatics 2006, 7: Article 276). The authors of that paper came to the conclusion that the process of multiple targeting in short oligonucleotide microarrays induces spurious correlations and this effect may deteriorate the inference on correlation coefficients. The design of their study and supporting simulations cast serious doubt upon the validity of this conclusion. The work by Okoniewski and Miller drove us to revisit the issue by means of experimentation with biological data and probabilistic modeling of cross-hybridization effects.

RESULTS: We have identified two serious flaws in the study by Okoniewski and Miller: (1) The data used in their paper are not amenable to correlation analysis; (2) The proposed simulation model is inadequate for studying the effects of cross-hybridization. Using two other data sets, we have shown that removing multiply targeted probe sets does not lead to a shift in the histogram of sample correlation coefficients towards smaller values. A more realistic approach to mathematical modeling of cross-hybridization demonstrates that this process is by far more complex than the simplistic model considered by the authors. A diversity of correlation effects (such as the induction of positive or negative correlations) caused by cross-hybridization can be expected in theory but there are natural limitations on the ability to provide quantitative insights into such effects due to the fact that they are not directly observable.

CONCLUSION: The proposed stochastic model is instrumental in studying general regularities in hybridization interaction between probe sets in microarray data. As the problem stands now, there is no compelling reason to believe that multiple targeting causes a large-scale effect on the correlation structure of Affymetrix gene expression data. Our analysis suggests that the observed long-range correlations in microarray data are of a biological nature rather than a technological flaw.
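
The comparison described under RESULTS (pairwise sample correlation coefficients computed with and without the multiply targeted probe sets, then compared as histograms) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' code: the expression values are synthetic placeholders, the `multiply_targeted` annotation is hypothetical, and the helper names `pearson`, `pairwise_correlations`, and `histogram` are assumptions introduced here.

```python
import math
import random
from itertools import combinations


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations(expr, probe_ids):
    """Correlation coefficient for every pair of the given probe sets."""
    return [pearson(expr[a], expr[b]) for a, b in combinations(probe_ids, 2)]


def histogram(values, bins=10):
    """Counts of values in equal-width bins covering [-1, 1]."""
    counts = [0] * bins
    for v in values:
        idx = min(int((v + 1.0) / 2.0 * bins), bins - 1)
        counts[idx] += 1
    return counts


# Synthetic placeholder data (NOT from the paper): 40 probe sets x 30 arrays.
# A shared factor makes most pairwise correlations positive.
random.seed(0)
n_probes, n_arrays = 40, 30
common = [random.gauss(0, 1) for _ in range(n_arrays)]
expr = {p: [c + random.gauss(0, 1) for c in common] for p in range(n_probes)}

# Hypothetical annotation: probe sets flagged as multiply targeted.
multiply_targeted = set(range(0, n_probes, 4))

all_ids = sorted(expr)
kept_ids = [p for p in all_ids if p not in multiply_targeted]

r_all = pairwise_correlations(expr, all_ids)
r_kept = pairwise_correlations(expr, kept_ids)

print("pairs (all, kept):", len(r_all), len(r_kept))
print("mean r (all):", round(sum(r_all) / len(r_all), 3))
print("mean r (kept):", round(sum(r_kept) / len(r_kept), 3))
print("histogram (all): ", histogram(r_all))
print("histogram (kept):", histogram(r_kept))
```

With real data, `expr` would hold normalized probe-set-level expression values and `multiply_targeted` would come from a probe annotation. If the two histograms look alike, that would be in line with the abstract's finding that removing such probe sets does not shift the correlations towards smaller values.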

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0a4/2211459/3e3ef4079170/1745-6150-2-28-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0a4/2211459/880e834948fb/1745-6150-2-28-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0a4/2211459/fde2bef6b926/1745-6150-2-28-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0a4/2211459/1c5d82b12fdd/1745-6150-2-28-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a0a4/2211459/630a326f9f43/1745-6150-2-28-5.jpg

Similar articles

[1]
Revisiting adverse effects of cross-hybridization in Affymetrix gene expression data: do they matter for correlation analysis?

Biol Direct. 2007-11-7

[2]
A model of binding on DNA microarrays: understanding the combined effect of probe synthesis failure, cross-hybridization, DNA fragmentation and other experimental details of Affymetrix arrays.

BMC Genomics. 2012-12-27

[3]
Stochastic models inspired by hybridization theory for short oligonucleotide arrays.

J Comput Biol. 2005

[4]
Cross-species analysis of gene expression in non-model mammals: reproducibility of hybridization on high density oligonucleotide microarrays.

BMC Genomics. 2007-4-3

[5]
Redefinition of Affymetrix probe sets by sequence overlap with cDNA microarray probes reduces cross-platform inconsistencies in cancer-associated gene expression measurements.

BMC Bioinformatics. 2005-4-25

[6]
The effects of normalization on the correlation structure of microarray data.

BMC Bioinformatics. 2005-5-16

[7]
GenXHC: a probabilistic generative model for cross-hybridization compensation in high-density genome-wide microarray data.

Bioinformatics. 2005-6

[8]
Exploring drug action on Mycobacterium tuberculosis using Affymetrix oligonucleotide GeneChips.

Tuberculosis (Edinb). 2006-3

[9]
An analysis of the use of genomic DNA as a universal reference in two channel DNA microarrays.

BMC Genomics. 2005-5-8

[10]
Relationship between gene expression and observed intensities in DNA microarrays--a modeling study.

Nucleic Acids Res. 2006-5-24

Cited by

[1]
A Pipeline for High-Throughput Concentration Response Modeling of Gene Expression for Toxicogenomics.

Front Genet. 2017-11-1

[2]
Cell cycle gene networks are associated with melanoma prognosis.

PLoS One. 2012-4-20

[3]
Balancing Type One and Two Errors in Multiple Testing for Differential Expression of Genes.

Comput Stat Data Anal. 2009-3-15

[4]
Genes and gene expression modules associated with caloric restriction and aging in the laboratory mouse.

BMC Genomics. 2009-12-7

[5]
Large datasets in biomedicine: a discussion of salient analytic issues.

J Am Med Inform Assoc. 2009-8-28

[6]
A nitty-gritty aspect of correlation and network inference from gene expression data.

Biol Direct. 2008-8-20

References

[1]
Capturing heterogeneity in gene expression studies by surrogate variable analysis.

PLoS Genet. 2007-9

[2]
Comments on probabilistic models behind the concept of false discovery rate.

J Bioinform Comput Biol. 2007-8

[3]
Re-sampling strategy to improve the estimation of number of null hypotheses in FDR control under strong correlation structures.

BMC Bioinformatics. 2007-5-18

[4]
How high is the level of technical noise in microarray data?

Biol Direct. 2007-4-11

[5]
Deconfounding microarray analysis - independent measurements of cell type proportions used in a regression model to resolve tissue heterogeneity bias.

Methods Inf Med. 2006

[6]
The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease.

Science. 2006-9-29

[7]
The MicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements.

Nat Biotechnol. 2006-9

[8]
Utility of correlation measures in analysis of gene expression.

NeuroRx. 2006-7

[9]
Hybridization interactions between probesets in short oligo microarrays lead to spurious correlations.

BMC Bioinformatics. 2006-6-2

[10]
A new type of stochastic dependence revealed in gene expression data.

Stat Appl Genet Mol Biol. 2006
