Langdon WB, Harrison AP
Department of Computer Science, King's College London, Strand, London, WC2R 2LS, UK.
Algorithms Mol Biol. 2009 Mar 19;4:6. doi: 10.1186/1748-7188-4-6.
Affymetrix High Density Oligonucleotide Arrays (HDONA) simultaneously measure expression of thousands of genes using millions of probes. We use correlations between measurements for the same gene across 6685 human tissue samples from NCBI's GEO database to indicate the quality of individual HG-U133A probes. Low correlation indicates a poor probe.
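The correlation-based quality measure can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes each probe's quality is scored as its mean Pearson correlation with the other probes targeting the same gene, and the names `pearson`, `mean_correlation_per_probe`, and the toy probe labels are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists of measurements."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant signal carries no correlation information
    return cov / sqrt(var_x * var_y)


def mean_correlation_per_probe(probeset):
    """Score each probe by its mean correlation with the other probes
    for the same gene (assumed scoring rule, for illustration only).

    probeset maps a probe name to its measurements, one per sample.
    """
    scores = {}
    for name, values in probeset.items():
        others = [pearson(values, v) for other, v in probeset.items() if other != name]
        scores[name] = sum(others) / len(others) if others else 0.0
    return scores


# Toy data: three probes for one gene, measured on four samples.
toy_probeset = {
    "p1": [1.0, 2.0, 3.0, 4.0],
    "p2": [2.0, 4.0, 6.0, 8.0],
    "p3": [4.0, 1.0, 3.0, 2.0],  # disagrees with the other two
}
scores = mean_correlation_per_probe(toy_probeset)
worst = min(scores, key=scores.get)
```

In this toy example `p3` gets the lowest score because it does not track the other two probes across samples, which is the sense in which low correlation flags a poor probe.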
Regular expressions can be automatically created from a Backus-Naur form (BNF) context-free grammar using strongly typed genetic programming.
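As a rough sketch of the grammar-driven part of this idea, the snippet below draws random regular expressions from a small BNF-style grammar over the DNA alphabet. It covers only random derivation from a grammar, which is how an initial population might be seeded; it does not implement the evolutionary search, fitness evaluation, or type system of strongly typed genetic programming. The grammar and the `derive` function are assumptions made for illustration and are not taken from the paper.

```python
import random
import re

# Hypothetical grammar: each non-terminal maps to a list of productions.
# The first production of every rule is non-recursive, so derivation can
# always be forced to terminate.
GRAMMAR = {
    "<re>": [["<term>"], ["<term>", "<re>"]],
    "<term>": [["<atom>"], ["<atom>", "<quant>"]],
    "<atom>": [["<base>"], ["[", "<base>", "<base>", "]"]],
    "<base>": [["A"], ["C"], ["G"], ["T"]],
    "<quant>": [["+"], ["*"], ["{4}"]],
}


def derive(symbol, rng, depth=0, max_depth=6):
    """Expand a grammar symbol into a regex string by random derivation."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit as-is
    productions = GRAMMAR[symbol]
    if depth >= max_depth:
        production = productions[0]  # force the non-recursive choice
    else:
        production = rng.choice(productions)
    return "".join(derive(s, rng, depth + 1, max_depth) for s in production)


rng = random.Random(0)
patterns = [derive("<re>", rng) for _ in range(50)]
compiled = [re.compile(p) for p in patterns]  # every derivation is a valid regex
```

Because every string is produced by the grammar, each candidate is syntactically valid by construction, which is the main attraction of constraining the search with a grammar.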
The automatically produced motif is better at predicting poor DNA sequences than an existing human-generated RE, suggesting that runs of cytosine and guanine, and mixtures of the two, should all be avoided.
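To show how a regular-expression motif can be used to screen probe sequences, here is a small sketch. The pattern below is a hypothetical stand-in chosen only to match runs of C, runs of G, and mixtures of the two; it is not the motif evolved in the paper, and the threshold of four bases and the example sequences are assumptions.

```python
import re

# Hypothetical motif for illustration only: four or more consecutive
# bases that are each C or G (pure runs or mixtures).
ILLUSTRATIVE_MOTIF = re.compile(r"[CG]{4,}")


def flag_probes(sequences, motif=ILLUSTRATIVE_MOTIF):
    """Return the sequences that contain the motif anywhere."""
    return [seq for seq in sequences if motif.search(seq)]


examples = ["ATATGGGGATAT", "ACGTACGTACGT", "TTCGCGTT"]
flagged = flag_probes(examples)
```

Here the first and third sequences are flagged (a run of G and a C/G mixture respectively), while the second is not, since its C/G stretches are only two bases long.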