Detecting false expression signals in high-density oligonucleotide arrays by an in silico approach.

Author information

Zhang Jinghui, Finney Richard P, Clifford Robert J, Derr Leslie K, Buetow Kenneth H

Affiliation

Laboratory of Population Genetics, National Cancer Institute/National Institutes of Health, 8424 Helgerman Court, Room 101, MSC 8302, Bethesda, MD 20892-8302, USA.

Publication information

Genomics. 2005 Mar;85(3):297-308. doi: 10.1016/j.ygeno.2004.11.004.

Abstract

High-density oligonucleotide arrays have become a popular assay for concurrent measurement of mRNA expression at the genome scale. Much effort has been devoted to the development of statistical analysis tools aimed at reducing experimental noise and normalizing experimental variation in gene expression analysis. However, these investigations do not detect or catalog systematic problems associated with specific oligonucleotide probes. Here, we present an investigation of problematic probes that yield consistent but inaccurate signals across multiple experiments. By evaluating data integrity among gene, probe sequence, and genomic structure we identified a total of 20,696 (10.5%) nonspecific probes that could cross-hybridize to multiple genes and a total of 18,363 (9.3%) probes that miss the target transcript sequences on the Affymetrix GeneChip U95A/Av2 array. The numbers of nonspecific and mistargeted probes on the U133A array are 29,405 (12.1%) and 19,717 (8.0%), respectively. The poor performance of the mistargeted probes was confirmed in two GeneChip experiments, in which these probes showed a 20-30% decrease in detecting present signals compared with normal probes. Comparison of qualitative expression signals obtained from SAGE and EST data with those from GeneChip arrays showed that the consistency of the two platforms is 30% lower in problematic probes than in normal probes. A Web application was developed to apply our results for improving the accuracy of expression analysis.
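The two probe categories named in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the input layout (an intended target gene plus the set of genes whose transcripts a probe sequence matches), and the rule that a probe missing its target is labeled mistargeted before nonspecificity is checked are all assumptions made for illustration, and the gene and probe names are toy data.

```python
from collections import Counter


def classify_probe(target_gene, matched_genes):
    """Label one probe (hypothetical helper, not from the paper).

    target_gene   -- the gene the probe is annotated to measure
    matched_genes -- genes whose transcripts the probe sequence matches
    """
    matched = set(matched_genes)
    # Assumption: a probe that does not match its own target transcript
    # is called "mistargeted", whatever else it matches.
    if target_gene not in matched:
        return "mistargeted"
    # Matches its target plus at least one other gene: may cross-hybridize.
    if len(matched) > 1:
        return "nonspecific"
    return "normal"


# Toy input: probe id -> (annotated target, genes matched by the sequence).
probes = {
    "probe_1": ("GENE_A", {"GENE_A"}),
    "probe_2": ("GENE_A", {"GENE_A", "GENE_B"}),
    "probe_3": ("GENE_C", set()),
}

labels = {pid: classify_probe(target, hits) for pid, (target, hits) in probes.items()}
counts = Counter(labels.values())

for label, n in sorted(counts.items()):
    # Share of all probes in each category, as the abstract reports per array.
    print(f"{label}: {n} ({100 * n / len(probes):.1f}%)")
```

In practice the matched-gene sets would come from aligning probe sequences against transcript and genome sequences. That step is outside this sketch.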
