Binder Hans, Preibisch Stephan, Kirsten Toralf
Interdisciplinary Centre for Bioinformatics, University of Leipzig, Haertelstrasse 16-18, D-04107 Leipzig, Germany.
Langmuir. 2005 Sep 27;21(20):9287-302. doi: 10.1021/la051231s.
Microarray technology makes it possible to estimate the expression level of thousands of genes at once by measuring the abundance of the respective messenger RNA. The method is based on the sequence-specific binding of RNA to DNA probes and its detection using fluorescent labels. The raw intensity data are affected by the sequence-specific affinity of probe and RNA for duplex formation, by the background intensity due to nonspecific hybridization at low transcript concentrations, and by the saturation of the probes at high transcript concentrations owing to surface adsorption. We address these issues using a binding model that describes specific and nonspecific hybridization in terms of a competitive two-species Langmuir isotherm, and DNA/RNA duplex formation in terms of sequence-specific, single-base-related interactions. GeneChip microarray technology uses pairs of so-called perfect match (PM) and mismatch (MM) oligonucleotide probes to estimate the amount of nonspecific hybridization. The mean affinity of the probes decreases according to PM(specific) > MM(specific) >> PM(nonspecific) ≈ MM(nonspecific). The stability of specific and nonspecific DNA/RNA duplexes is mainly determined by Watson-Crick (WC) pairings. Mismatched self-complementary pairings in the middle of the MM sequence contribute only weakly to duplex stability. The asymmetry of base pair interactions in the DNA/RNA hybrid duplexes gives rise to a duplet-like symmetry of the PM - MM intensity difference when nonspecific hybridization dominates and a triplet-like symmetry when specific hybridization dominates. The signal intensities of the PM and MM probes and their difference are assessed in terms of sensitivity and specificity. The results suggest that existing probe-level analysis algorithms can be refined to correct microarray data for nonspecific background intensity and saturation on the basis of the probe sequence.
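The competitive two-species Langmuir isotherm named in the abstract can be illustrated with a minimal sketch. It uses the textbook form of the isotherm, in which specific and nonspecific transcripts compete for the same probe sites; the abstract does not give the authors' exact equations, so the function name `langmuir_intensity`, the parameter `i_max`, and all affinity and concentration values below are hypothetical and chosen only to reproduce the stated ordering PM(specific) > MM(specific) >> PM(nonspecific) ≈ MM(nonspecific).

```python
def langmuir_intensity(c_specific, c_nonspecific, k_specific, k_nonspecific, i_max=1.0):
    """Textbook competitive two-species Langmuir isotherm (an assumed form,
    not taken from the paper): fraction of occupied probe sites times i_max."""
    x = k_specific * c_specific + k_nonspecific * c_nonspecific
    return i_max * x / (1.0 + x)


# Hypothetical affinities (arbitrary units), illustrative only.
K_PM_S = 1.0    # PM probe, specific target
K_MM_S = 0.1    # MM probe, specific target
K_N = 0.001     # PM and MM probes, nonspecific transcripts (assumed equal)
C_N = 100.0     # fixed nonspecific background concentration (assumed)

for c_s in (0.0, 0.01, 1.0, 100.0, 1e6):
    pm = langmuir_intensity(c_s, C_N, K_PM_S, K_N)
    mm = langmuir_intensity(c_s, C_N, K_MM_S, K_N)
    print(f"c_S={c_s:>10g}  PM={pm:.4f}  MM={mm:.4f}  PM-MM={pm - mm:.4f}")
```

Under these assumptions, the PM and MM signals coincide when no specific transcript is present (pure background), and the PM - MM difference shrinks again at very high concentrations as both probes approach saturation.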