Royce Thomas E, Rozowsky Joel S, Gerstein Mark B
Interdepartmental Program in Computational Biology and Bioinformatics, Yale University, USA.
Nucleic Acids Res. 2007;35(15):e99. doi: 10.1093/nar/gkm549. Epub 2007 Aug 7.
A generic DNA microarray design applicable to any species would greatly benefit comparative genomics. We have addressed the feasibility of such a design by leveraging the high feature densities and relatively unbiased nature of genomic tiling microarrays. Specifically, we first divided each Homo sapiens RefSeq-derived gene's spliced nucleotide sequence into all of its possible contiguous 25 nt subsequences. For each of these 25 nt subsequences, we searched a recent human transcript mapping experiment's probe design for the 25 nt probe sequence that had the fewest mismatches with the subsequence without matching it exactly. Signal intensities measured with each gene's nearest-neighbor features were subsequently averaged to predict that gene's expression level in each of the experiment's thirty-three hybridizations. We examined the fidelity of this approach in terms of both sensitivity and specificity for detecting actively transcribed genes, for transcriptional consistency between exons of the same gene, and for reproducibility between tiling array designs. Taken together, our results provide proof-of-principle for probing nucleic acid targets with off-target, nearest-neighbor features.
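The procedure described in the abstract (tile each transcript into 25 nt subsequences, find the closest non-identical probe for each, then average the matched probes' intensities) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the brute-force linear scan, the first-found tie-breaking rule, and the choice to count one intensity per subsequence are all assumptions made for illustration. Strand orientation is ignored, and a real tiling-array design with millions of probes would need an indexed search rather than a scan.

```python
from statistics import mean

K = 25  # subsequence and probe length stated in the abstract


def tile(seq, k=K):
    """Return every contiguous k-nt subsequence of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def hamming(a, b):
    """Count mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbor(subseq, probes):
    """Return the probe with the fewest mismatches to subseq, skipping exact matches.

    Assumption: ties go to the first probe encountered. Returns None when no
    probe of the right length differs from subseq.
    """
    best, best_d = None, None
    for p in probes:
        if len(p) != len(subseq):
            continue  # only compare probes of the same length
        d = hamming(subseq, p)
        if d == 0:
            continue  # exact matches are excluded, as in the abstract
        if best_d is None or d < best_d:
            best, best_d = p, d
    return best


def predict_expression(transcript, intensities, k=K):
    """Average the nearest-neighbor probe intensities for one transcript.

    intensities: hypothetical dict mapping probe sequence -> signal measured
    in a single hybridization. Assumption: a probe selected by several
    subsequences contributes once per subsequence.
    """
    probes = list(intensities)
    signals = []
    for sub in tile(transcript, k):
        nn = nearest_neighbor(sub, probes)
        if nn is not None:
            signals.append(intensities[nn])
    return mean(signals) if signals else None
```

For a toy check with 4 nt probes instead of 25 nt, `predict_expression("ACGTA", {"ACGT": 100.0, "ACGA": 10.0, "CGTT": 30.0, "TTTT": 50.0}, k=4)` ignores the exact match `ACGT`, selects `ACGA` and `CGTT` as nearest neighbors, and returns their average, 20.0.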