How to decide? Different methods of calculating gene expression from short oligonucleotide array data will give different results.

Author information

Millenaar Frank F, Okyere John, May Sean T, van Zanten Martijn, Voesenek Laurentius A C J, Peeters Anton J M

Affiliation

Plant Ecophysiology, Institute of Environmental Biology, Faculty of Science, Utrecht University, Sorbonnelaan 16, 3584 CA Utrecht, The Netherlands.

Publication information

BMC Bioinformatics. 2006 Mar 15;7:137. doi: 10.1186/1471-2105-7-137.

Abstract

BACKGROUND

Short oligonucleotide arrays for transcript profiling have been available for several years. Generally, raw data from these arrays are analysed with the aid of the Microarray Analysis Suite or GeneChip Operating Software (MAS or GCOS) from Affymetrix. Recently, more methods to analyse the raw data have become available. Ideally all these methods should come up with more or less the same results. We set out to evaluate the different methods and include work on our own data set, in order to test which method gives the most reliable results.

RESULTS

Calculating gene expression with six different algorithms (MAS5, dChip PMMM, dChip PM, RMA, GC-RMA and PDNN) from the same (Arabidopsis) data results in different calculated gene expression levels. Consequently, depending on the method used, different genes will be identified as differentially regulated. Surprisingly, there was only 27 to 36% overlap between the different methods. Furthermore, 47.5% of the genes/probe sets showed good correlation between the mismatch and perfect match intensities.
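The correlation between mismatch (MM) and perfect match (PM) intensities can be illustrated with a minimal sketch. The abstract does not say how the correlation was computed or what threshold counted as "good", so the use of a Pearson coefficient across the probe pairs of one probe set is an assumption, and all intensity values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical intensities for the probe pairs of one probe set
# (illustrative numbers only, not data from the study).
pm = [200.0, 400.0, 600.0, 800.0, 1000.0]
mm = [100.0, 210.0, 290.0, 410.0, 490.0]

r = pearson(pm, mm)

# Mean signal before and after subtracting MM from PM.
mean_pm = sum(pm) / len(pm)
mean_pm_minus_mm = sum(p - m for p, m in zip(pm, mm)) / len(pm)

print(f"r = {r:.3f}, mean PM = {mean_pm}, mean PM - MM = {mean_pm_minus_mm}")
```

In this toy probe set the MM values follow the PM values closely, so subtracting them removes about half of the measured signal.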

CONCLUSION

After comparing six algorithms, RMA gave the most reproducible results and showed the highest correlation coefficients with Real Time RT-PCR data on genes identified as differentially expressed by all methods. However, we were not able to verify, by Real Time RT-PCR, the microarray results for most genes that were identified solely by RMA. Furthermore, we conclude that subtracting the mismatch intensity from the perfect match intensity most likely results in a significant underestimation for at least 47.5% of the expression values. Not one algorithm produced significant expression values for genes present in quantities below 1 pmol. If the only purpose of the microarray experiment is to find new candidate genes, and too many genes are found, then mutual exclusion of the genes predicted by contrasting methods can be used to narrow down the list of new candidate genes by 64 to 73%.
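The narrowing step can be sketched as follows, on the assumption that "mutual exclusion" means keeping only the genes that every method calls as differentially expressed. The reported 64 to 73% reduction equals 100 minus the 27 to 36% overlap, which is consistent with that reading, but the abstract does not spell out the procedure. The function name and gene identifiers are hypothetical.

```python
def narrow_candidates(calls):
    """Keep genes called by every method; report percent reduction per method.

    `calls` maps a method name to the set of genes that method flags as
    differentially expressed. Methods with an empty list are skipped in the
    reduction report to avoid dividing by zero.
    """
    gene_sets = [set(genes) for genes in calls.values()]
    consensus = set.intersection(*gene_sets) if gene_sets else set()
    reduction = {
        method: 100.0 * (len(genes) - len(consensus)) / len(genes)
        for method, genes in calls.items()
        if genes
    }
    return consensus, reduction

# Hypothetical gene lists (placeholders, not results from the study):
# each method flags 10 genes, and only 3 are shared by all of them.
shared = {"g1", "g2", "g3"}
calls = {
    "MAS5": {f"g{i}" for i in range(1, 11)},
    "RMA": shared | {f"g{i}" for i in range(11, 18)},
    "PDNN": shared | {f"g{i}" for i in range(18, 25)},
}

consensus, reduction = narrow_candidates(calls)
print(sorted(consensus))
print(reduction)
```

With these toy lists each method's candidate list shrinks from 10 genes to the 3 consensus genes, a 70% reduction.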

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bd73/1431565/1a8724c1491e/1471-2105-7-137-1.jpg
