Mattes William B
Investigative Toxicology, Pfizer Inc, Kalamazoo, Michigan, USA.
Environ Health Perspect. 2004 Mar;112(4):506-10. doi: 10.1289/ehp.6698.
On the surface, transcript profiling using microarrays seems to offer a way of looking at the global response of the cell to perturbation, with a focus on changes in gene expression. The difficulty, however, is that the response of a particular gene is actually measured on the array by an element that is a short, defined nucleic acid sequence. Sequences that map back to the same genetic locus may be given different names and descriptions when they are deposited in public sequence databases; when such sequences are used in microarray construction, elements that monitor the same genetic locus may therefore carry different names and descriptions. The algorithm described here uses a hierarchical approach to assign a single best annotation to each element in a given microarray, so that elements from one microarray platform may be cross-indexed with those of another. The algorithm relies on the nucleic acid accession number of each array element and uses it to retrieve annotation from the most recent versions of LocusLink and UniGene. Both database resources are searched, with priority given to annotation derived from the curated LocusLink database. When neither database yields annotation, the default GenBank annotation is used. As a final outcome, a cross-chip identifier is generated that may be used to cross-index array elements. The program is available as a Practical Extraction and Report Language (Perl) script that can run under any Perl interpreter.
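The lookup hierarchy described in the abstract can be illustrated with a minimal sketch. It is written in Python for brevity and is not the author's Perl script. The in-memory tables, the accession numbers, the element names, and the `LL:`/`UG:`/`GB:` identifier prefixes are all hypothetical placeholders chosen for illustration; the abstract does not specify the format of the cross-chip identifier or how the databases are queried.

```python
from typing import NamedTuple


class Annotation(NamedTuple):
    cross_chip_id: str  # shared by elements that monitor the same locus
    name: str
    source: str         # resource that supplied the annotation


# Hypothetical lookup tables keyed by accession number (invented values).
LOCUSLINK = {"ACC001": ("1001", "gene alpha"), "ACC002": ("1001", "gene alpha")}
UNIGENE = {
    "ACC001": ("Xx.7", "alpha cluster"),   # also in LocusLink, which wins
    "ACC003": ("Xx.42", "cluster beta"),
    "ACC004": ("Xx.42", "cluster beta"),
}
GENBANK = {"ACC005": "unnamed EST"}


def annotate(accession, locuslink, unigene, genbank):
    """Return the single best annotation for one array element."""
    # 1. Curated LocusLink annotation has priority.
    if accession in locuslink:
        locus_id, name = locuslink[accession]
        return Annotation(f"LL:{locus_id}", name, "LocusLink")
    # 2. Otherwise fall back to the UniGene cluster.
    if accession in unigene:
        cluster_id, name = unigene[accession]
        return Annotation(f"UG:{cluster_id}", name, "UniGene")
    # 3. Otherwise keep the default GenBank annotation.
    return Annotation(f"GB:{accession}", genbank.get(accession, "unknown"), "GenBank")


def cross_index(chip_a, chip_b, locuslink, unigene, genbank):
    """Pair elements of two platforms that share a cross-chip identifier."""
    by_id = {}
    for element, accession in chip_a.items():
        key = annotate(accession, locuslink, unigene, genbank).cross_chip_id
        by_id.setdefault(key, []).append(element)
    pairs = []
    for element, accession in chip_b.items():
        key = annotate(accession, locuslink, unigene, genbank).cross_chip_id
        pairs.extend((match, element) for match in by_id.get(key, []))
    return pairs


# Two hypothetical platforms: element name -> accession number.
CHIP_A = {"A_elem1": "ACC001", "A_elem2": "ACC003", "A_elem3": "ACC005"}
CHIP_B = {"B_elem1": "ACC002", "B_elem2": "ACC004", "B_elem3": "ACC005"}

if __name__ == "__main__":
    for pair in cross_index(CHIP_A, CHIP_B, LOCUSLINK, UNIGENE, GENBANK):
        print(pair)
```

In this sketch, elements built from different accession numbers are paired whenever those accessions resolve to the same locus or cluster, which is the practical benefit of deriving one identifier per element rather than comparing names or descriptions directly.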