A statistical framework for consolidating "sibling" probe sets for Affymetrix GeneChip data.

Authors

Li Hua, Zhu Dongxiao, Cook Malcolm

Affiliation

Bioinformatics Center, Stowers Institute for Medical Research, 1000 E 50th St, Kansas City, MO 64110, USA.

Publication

BMC Genomics. 2008 Apr 24;9:188. doi: 10.1186/1471-2164-9-188.

Abstract

BACKGROUND

Affymetrix GeneChip arrays typically contain multiple probe sets per gene, defined as sibling probe sets in this study. These probe sets may or may not behave similarly across treatments. The most appropriate way to consolidate sibling probe sets for analysis is an open problem. We propose an Analysis of Variance (ANOVA) framework to decide which sibling probe sets can be consolidated.

RESULTS

The ANOVA model allows us to separate the sibling probe sets into two types: those that behave similarly across treatments and those that behave differently across treatments. We found that consolidating sibling probe sets of the former type results in a large increase in the number of differentially expressed genes under various statistical criteria. The approach to selecting sibling probe sets suitable for consolidation is implemented in R and is freely available from http://research.stowers-institute.org/hul/affy/.
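The abstract does not spell out the model terms, so the following is only a minimal sketch of one way such a decision could be made, not the authors' R code. It assumes a balanced two-way layout (probe set by treatment, with replicates) and treats a small probe set by treatment interaction F statistic as evidence that siblings behave similarly. The function names `interaction_f` and `can_consolidate`, the data layout, and the cutoff `f_crit` are all hypothetical choices made for illustration.

```python
from statistics import mean


def interaction_f(data):
    """Return (F, df_interaction, df_error) for the probe-set x treatment
    interaction in a balanced two-way ANOVA.

    Assumed layout: data[p][t] is a list of replicate log-expression
    values for sibling probe set p under treatment t, with the same
    number of replicates (at least two) in every cell.
    """
    n_probe = len(data)
    n_treat = len(data[0])
    n_rep = len(data[0][0])

    # Cell, row (probe set), column (treatment) and grand means.
    cell = [[mean(reps) for reps in row] for row in data]
    probe_mean = [mean(row) for row in cell]
    treat_mean = [mean(cell[p][t] for p in range(n_probe))
                  for t in range(n_treat)]
    grand = mean(probe_mean)

    # Interaction sum of squares: what is left of each cell mean after
    # removing the additive probe-set and treatment effects.
    ss_int = n_rep * sum(
        (cell[p][t] - probe_mean[p] - treat_mean[t] + grand) ** 2
        for p in range(n_probe) for t in range(n_treat)
    )
    # Error sum of squares: replicate scatter within each cell.
    # (If all replicates are identical this is zero and F is undefined.)
    ss_err = sum(
        (y - cell[p][t]) ** 2
        for p in range(n_probe) for t in range(n_treat)
        for y in data[p][t]
    )
    df_int = (n_probe - 1) * (n_treat - 1)
    df_err = n_probe * n_treat * (n_rep - 1)
    return (ss_int / df_int) / (ss_err / df_err), df_int, df_err


def can_consolidate(data, f_crit):
    """Hypothetical rule: consolidate when the interaction F statistic
    stays below a caller-supplied critical value."""
    f_stat, _, _ = interaction_f(data)
    return f_stat < f_crit


# Two siblings with different baselines but the same treatment response.
parallel = [[[7.0, 7.2], [9.0, 9.2]],
            [[5.0, 5.2], [7.0, 7.2]]]
# Two siblings that respond in opposite directions.
divergent = [[[7.0, 7.2], [9.0, 9.2]],
             [[7.0, 7.2], [5.0, 5.2]]]

# 7.71 is approximately the 5% critical value of F(1, 4).
print(can_consolidate(parallel, f_crit=7.71))
print(can_consolidate(divergent, f_crit=7.71))
```

In this sketch a difference in baseline intensity between siblings is absorbed by the probe-set main effect, so only a difference in how the siblings respond to treatment counts against consolidation. In practice the cutoff would come from the F distribution with the returned degrees of freedom rather than a hard-coded number.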

CONCLUSION

Our ANOVA analysis of sibling probe sets provides a statistical framework for selecting sibling probe sets for consolidation. Consolidating sibling probe sets by pooling their data greatly improves the estimate of a gene's expression level and identifies more biologically relevant genes. Sibling probe sets that do not qualify for consolidation may represent annotation errors or other artifacts, or may correspond to differentially processed transcripts of the same gene that require further analysis.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bdf1/2397416/1fe42aa55706/1471-2164-9-188-1.jpg
