Kumari Sunita, Verma Lalit K, Weller Jennifer W
Department of Computer and Information Science, Indiana University Purdue University Indianapolis, Indianapolis, IN, 46202, USA.
BMC Bioinformatics. 2007 Jul 30;8:276. doi: 10.1186/1471-2105-8-276.
Affymetrix gene expression arrays incorporate paired perfect match (PM) and mismatch (MM) probes to distinguish true signals from those arising from cross-hybridization events. An MM signal often shows greater intensity than its paired PM signal; we propose that one underlying cause is the presence of allelic variants arising from single nucleotide polymorphisms (SNPs). To annotate and characterize SNP contributions to anomalous probe binding behavior, we have developed a software tool called AffyMAPSDetector.
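The PM/MM pairing can be illustrated with a minimal Python sketch. It assumes the standard Affymetrix design in which a 25-mer MM probe is the PM sequence with the base at position 13 replaced by its complement; the function name `mismatch_probe` and the example sequence are hypothetical and are not taken from AffyMAPSDetector.

```python
# Illustrative only: the helper name and the 25-mer below are made up.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mismatch_probe(pm: str, position: int = 13) -> str:
    """Return the MM partner of a PM probe.

    The base at the given 1-based position is replaced by its complement;
    all other bases are copied unchanged.
    """
    i = position - 1  # convert to 0-based index
    return pm[:i] + COMPLEMENT[pm[i]] + pm[i + 1:]


pm = "ACGTACGTACGTACGTACGTACGTA"  # hypothetical 25-mer PM probe
mm = mismatch_probe(pm)

# The two probes differ at exactly one position, the central base.
differences = [k + 1 for k, (a, b) in enumerate(zip(pm, mm)) if a != b]
print(differences)
```

If a SNP allele in the expressed transcript happens to carry the base that pairs with the MM probe at this central position, the MM probe rather than the PM probe is the exact match for that transcript.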
AffyMAPSDetector can be used to describe any Affymetrix expression GeneChip with respect to SNPs. When AffyMAPSDetector was run on GeneChip HG-U95Av2 against dbSNP-build-123, we found 7,286 probes (belonging to 2,582 probesets) containing SNPs, of which 325 probes contained at least one SNP at position 13. Against dbSNP-build-126, 8,758 probes (belonging to 3,002 probesets) contained SNPs, of which 409 probes contained at least one SNP at position 13. Position 13 is the central base at which an MM probe differs from its PM partner, so, depending on the expressed allele, the MM probe can sometimes be the exact complement of the transcript. This information was used to characterize probe measurements reported in a published, well-replicated lung adenocarcinoma study. The total intensity distributions showed that the SNP-containing probes had a larger negative mean intensity difference (PM-MM) and a greater range of the difference than did probes without SNPs. In the sample replicates, SNP-containing probes with reproducible intensity ratios were identified, allowing selection of SNP probesets that yielded unique sample signatures. At the gene expression level, use of the (MM-PM) value for SNP-containing probes resulted in different Presence/Absence calls for some genes. Such a change in the status of genes has clear potential to influence downstream clustering and classification results.
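The kind of annotation reported above (which probes contain SNPs, and at which probe position) can be sketched in Python as follows. This is not AffyMAPSDetector's implementation, which the abstract does not describe. It assumes each probe is a 25-mer aligned to the forward strand with a known 1-based start coordinate, and that SNPs are given as (chromosome, position) pairs. The `Probe` class, `snp_positions` function, and all coordinates are hypothetical.

```python
from dataclasses import dataclass

PROBE_LENGTH = 25  # Affymetrix expression probes are 25-mers
CENTER = 13        # position at which the MM probe differs from the PM probe


@dataclass(frozen=True)
class Probe:
    probe_id: str
    chrom: str
    start: int  # assumed: 1-based coordinate of probe position 1, forward strand


def snp_positions(probe, snps):
    """Return the sorted 1-based probe positions that coincide with a SNP.

    snps is an iterable of (chrom, pos) tuples with 1-based positions.
    """
    hits = []
    for chrom, pos in snps:
        offset = pos - probe.start + 1  # 1-based position within the probe
        if chrom == probe.chrom and 1 <= offset <= PROBE_LENGTH:
            hits.append(offset)
    return sorted(hits)


# Made-up probes and SNP coordinates, for illustration only.
probes = [
    Probe("p1", "chr1", 1000),
    Probe("p2", "chr1", 2000),
    Probe("p3", "chr2", 500),
]
snps = [("chr1", 1012), ("chr1", 1030), ("chr1", 2024), ("chr2", 499)]

annotation = {p.probe_id: snp_positions(p, snps) for p in probes}
snp_probes = [pid for pid, pos in annotation.items() if pos]
center_hits = [pid for pid, pos in annotation.items() if CENTER in pos]
print(annotation, snp_probes, center_hits)
```

A real annotation would also need to handle reverse-strand probes and multi-base variants, which this sketch ignores.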
Output from this tool characterizes SNP-containing probes on GeneChip microarrays, thus improving our understanding of factors contributing to expression measurements. The pattern of SNP binding examined so far indicates distinct behavior of the SNP-containing probes and has the potential to help us identify new SNPs. Knowing which probes contain SNPs provides flexibility in determining whether to include or exclude them from gene-expression intensity calculations; selected sets of SNP-containing probes produce sample-unique signatures. AffyMAPSDetector information is available at http://www.binf.gmu.edu/weller/BMC_bioinformatics/AffyMapsDetector/index.html.
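The choice to include or exclude SNP-containing probes can be illustrated with a small Python sketch. It uses a plain average of PM-MM differences as a stand-in for a probeset summary, which is an assumption made for brevity and not the Affymetrix signal algorithm or the method used in the study. The function name `probeset_signal`, the dictionary keys, and the intensity values are all hypothetical.

```python
from statistics import mean


def probeset_signal(probes, exclude_snp=False):
    """Average PM-MM difference over a probeset.

    probes: list of dicts with keys "pm", "mm" (intensities) and "has_snp".
    Returns None if no probes remain after filtering.
    """
    kept = [p for p in probes if not (exclude_snp and p["has_snp"])]
    if not kept:
        return None
    return mean(p["pm"] - p["mm"] for p in kept)


# Made-up intensities: the SNP-containing probe has MM brighter than PM.
probeset = [
    {"pm": 900.0, "mm": 300.0, "has_snp": False},
    {"pm": 800.0, "mm": 400.0, "has_snp": False},
    {"pm": 200.0, "mm": 700.0, "has_snp": True},
]

with_snp = probeset_signal(probeset)
without_snp = probeset_signal(probeset, exclude_snp=True)
print(with_snp, without_snp)
```

In this toy probeset a single SNP-containing probe with a negative PM-MM difference pulls the average down substantially, which shows how such probes could shift a downstream Presence/Absence decision.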