Yuryev Anton, Huang JianPing, Pohl Mark, Patch Robert, Watson Felicia, Bell Peter, Donaldson Miriam, Phillips Michael S, Boyce-Jacino Michael T
Orchid Biosciences, Orchid Life Sciences, 303 East College Road, Princeton, NJ 08540, USA.
Nucleic Acids Res. 2002 Dec 1;30(23):e131. doi: 10.1093/nar/gnf131.
Using an empirical panel of more than 20 000 single base primer extension (SNP-IT) assays, we have developed a set of statistical scores for evaluating and rank-ordering various parameters of the SNP-IT reaction, to facilitate high-throughput assay primer design with an improved likelihood of success. Each score predicts either the signal magnitude from primer extension or the signal noise caused by mispriming of primers and by the structure of the PCR product. Analysis of assay results shows that every score correlates with the success/failure rate of the SNP-IT reaction. Logistic regression was applied to combine all scored parameters into one measure predicting the overall success/failure rate of a given SNP marker. Three training sets, one for each of three types of SNP-IT reaction and each containing about 22 000 SNP markers, were used to assign weights to each score and to optimize the prediction of the combined measure. c-Statistics of 0.69, 0.77 and 0.72 were achieved for the three training sets, respectively. This new statistical prediction can be used to improve primer design for the SNP-IT reaction and to evaluate the probability of genotyping success for a given SNP based on analysis of the surrounding genomic sequence.
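The combination step and its evaluation can be illustrated with a minimal sketch. It assumes the standard logistic form (an intercept plus a weighted sum of scores, passed through the logistic function) and the usual definition of the c-statistic as the fraction of success/failure pairs in which the successful assay receives the higher predicted probability, with ties counted as one half. The function names, weights and score values are hypothetical and are not taken from the paper.

```python
import math


def success_probability(scores, weights, intercept):
    """Combine per-assay scores into one predicted success probability.

    Standard logistic model: p = 1 / (1 + exp(-(b0 + sum(w_i * s_i)))).
    The weights and intercept would come from fitting on a training set;
    the values used below are placeholders, not fitted values.
    """
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    z = intercept + sum(w * s for w, s in zip(weights, scores))
    return 1.0 / (1.0 + math.exp(-z))


def c_statistic(predicted, outcomes):
    """Concordance statistic (equivalent to the area under the ROC curve).

    predicted: predicted success probabilities, one per assay.
    outcomes:  observed results, 1 for success and 0 for failure.
    """
    successes = [p for p, y in zip(predicted, outcomes) if y == 1]
    failures = [p for p, y in zip(predicted, outcomes) if y == 0]
    if not successes or not failures:
        raise ValueError("need at least one success and one failure")
    concordant = 0.0
    for s in successes:
        for f in failures:
            if s > f:
                concordant += 1.0   # success ranked above failure
            elif s == f:
                concordant += 0.5   # tie counts as half
    return concordant / (len(successes) * len(failures))


# Hypothetical example: two scores per assay (one signal, one noise).
weights = [1.2, -0.8]      # placeholder weights
intercept = 0.1            # placeholder intercept
assays = [[0.9, 0.1], [0.2, 0.7], [0.6, 0.3], [0.1, 0.9]]
observed = [1, 0, 1, 0]    # made-up outcomes for illustration
probs = [success_probability(a, weights, intercept) for a in assays]
print(c_statistic(probs, observed))
```

A c-statistic of 0.5 corresponds to random ranking and 1.0 to perfect separation of successful from failed assays, which is the scale on which the reported values of 0.69, 0.77 and 0.72 should be read.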