Kuo Winston Patrick, Jenssen Tor-Kristian, Butte Atul J, Ohno-Machado Lucila, Kohane Isaac S
Children's Hospital Informatics Program and Division of Endocrinology, Department of Medicine, Children's Hospital, Brigham and Women's Hospital, Harvard Medical School, Boston, MA 02115, USA.
Bioinformatics. 2002 Mar;18(3):405-12. doi: 10.1093/bioinformatics/18.3.405.
[corrected] The existence of several technologies for measuring gene expression makes cross-technology agreement of measurements an important issue. Cross-platform use of data from different technologies could reduce the need to duplicate experiments, but it requires corresponding measurements to be comparable.
mRNA measurements of 2895 sequence-matched genes in 56 cell lines from the National Cancer Institute's standard panel of 60 cancer cell lines (NCI 60) were compared between two high-throughput DNA microarray technologies, Stanford-type cDNA microarrays and Affymetrix oligonucleotide microarrays. The comparison calculated the correlation between matched measurements and the concordance between clusters obtained from each technology.
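As an illustration of the matched-measurement correlation, the sketch below computes one Pearson coefficient per gene across cell lines. It is not the authors' code: the function name `per_gene_correlation`, the genes-by-cell-lines matrix layout, and the assumption that both platforms' values have already been put on comparable scales are all hypothetical choices made for the example.

```python
import numpy as np

def per_gene_correlation(cdna, oligo):
    """Pearson correlation per gene (row) across matched cell lines (columns).

    Assumes (hypothetically) that both inputs are genes x cell-lines
    matrices with rows and columns already matched between platforms.
    """
    cdna = np.asarray(cdna, dtype=float)
    oligo = np.asarray(oligo, dtype=float)
    if cdna.shape != oligo.shape:
        raise ValueError("matrices must share the same genes x cell-lines shape")
    # Center each gene's profile across cell lines.
    a = cdna - cdna.mean(axis=1, keepdims=True)
    b = oligo - oligo.mean(axis=1, keepdims=True)
    denom = np.sqrt((a * a).sum(axis=1) * (b * b).sum(axis=1))
    # A gene with a constant profile has zero variance and yields NaN.
    with np.errstate(divide="ignore", invalid="ignore"):
        return (a * b).sum(axis=1) / denom

# Toy data: gene 0 agrees perfectly across platforms, gene 1 is reversed.
platform_a = np.array([[1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0]])
platform_b = np.array([[2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0]])
print(per_gene_correlation(platform_a, platform_b))
```

Summarizing the resulting vector, for example by its median or a histogram, would give one view of how well the two platforms agree overall.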
In general, corresponding measurements from the two platforms showed poor correlation. Clusters of genes and cell lines were discordant between the two technologies, suggesting that relative intra-technology relationships were not preserved. GC-content, sequence length, average signal intensity, and an estimator of cross-hybridization were found to be associated with the degree of correlation. This suggests that gene-specific, or more correctly probe-specific, factors influence measurements differently in the two platforms, implying a poor prognosis for broad use of gene expression measurements across platforms.
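One way to probe whether a probe-level feature such as GC-content tracks the per-gene cross-platform correlation is a rank correlation between the two. The abstract does not say which measure of association was used, so the choice of Spearman's coefficient here is an assumption, and the helper names `average_ranks` and `spearman` and the toy numbers are hypothetical.

```python
import numpy as np

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Replace the ranks within each group of ties by the group mean.
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    sums = np.bincount(inverse, weights=ranks)
    return sums[inverse] / counts[inverse]

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])

# Made-up values for illustration only, not data from the study.
gc_content = [0.38, 0.42, 0.47, 0.51, 0.55]
cross_platform_r = [0.61, 0.48, 0.40, 0.22, 0.10]
print(spearman(gc_content, cross_platform_r))
```

A coefficient near zero would indicate no monotonic relationship between the feature and agreement, while a value near 1 or -1 would indicate a strong one.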