Handley Daniel, Serban Nicoleta, Peters David, O'Doherty Robert, Field Melvin, Wasserman Larry, Spirtes Peter, Scheines Richard, Glymour Clark
Department of Statistics, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
Genomics. 2004 Jun;83(6):1169-75. doi: 10.1016/j.ygeno.2003.12.010.
We present evidence of a potentially serious source of error intrinsic to all spotted cDNA microarrays that use IMAGE clones of expressed sequence tags (ESTs). We found that a high proportion of these EST sequences contain 5'-end poly(dT) sequences that are remnants of the oligo(dT)-primed reverse transcription of polyadenylated mRNA templates used to generate EST cDNA for sequence clone libraries. Analysis of expression data from two single-dye cDNA microarray experiments showed that ESTs whose sequences contain runs of consecutive 5'-end dT residues appeared to be strongly coexpressed, while expression data for all other sequences exhibited no such pattern. Our analysis suggests that the apparent coexpression of sequences containing 5' poly(dT) tracts is more likely to be due to systematic cross-hybridization of these poly(dT) tracts than to true mRNA coexpression. This indicates that existing data generated by cDNA microarrays containing IMAGE clone ESTs should be filtered to remove expression data from sequences containing significant 5' poly(dT) tracts.
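The filtering step recommended in the abstract could be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the function names `leading_dt_run` and `filter_poly_dt`, the dictionary-based data layout, and the default threshold of 10 consecutive T residues are all assumptions made here, since the abstract does not say what length counts as a "significant" tract.

```python
from typing import Dict, List


def leading_dt_run(seq: str) -> int:
    """Count consecutive T residues at the 5' end (start) of a sequence."""
    seq = seq.strip().upper()
    # Length removed by stripping leading T's equals the length of the run.
    return len(seq) - len(seq.lstrip("T"))


def filter_poly_dt(
    expression: Dict[str, List[float]],
    sequences: Dict[str, str],
    min_run: int = 10,  # hypothetical threshold, not taken from the paper
) -> Dict[str, List[float]]:
    """Return expression data without clones whose EST sequence starts
    with at least `min_run` consecutive T residues.

    Clones with no known sequence are kept (an assumption of this sketch).
    """
    kept = {}
    for clone_id, values in expression.items():
        seq = sequences.get(clone_id)
        if seq is not None and leading_dt_run(seq) >= min_run:
            continue  # likely affected by poly(dT) cross-hybridization
        kept[clone_id] = values
    return kept


if __name__ == "__main__":
    # Toy, made-up clone identifiers and values for illustration only.
    seqs = {
        "clone_A": "TTTTTTTTTTTTTTTGCATCGA",
        "clone_B": "GCATCGATTTTTTTTTTTTTTT",
    }
    expr = {"clone_A": [1.2, 1.3], "clone_B": [0.4, 0.9]}
    print(sorted(filter_poly_dt(expr, seqs)))
```

A stricter or looser filter would only require changing `min_run`. Tracts that begin a few bases downstream of the 5' end are not detected by this sketch and would need a separate rule.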