Huang Ching Yu Austin, Studebaker Joel, Yuryev Anton, Huang Jianping, Scott Kathryn E, Kuebler Jennifer, Varde Shobha, Alfisi Steven, Gelfand Craig A, Pohl Mark, Boyce-Jacino Michael T
Center for Pharmacogenomics and Complex Disease Research, Newark, NJ 07101, USA.
BMC Bioinformatics. 2004 Apr 2;5:36. doi: 10.1186/1471-2105-5-36.
SNP genotyping typically incorporates a review step to ensure that the genotype calls for a particular SNP are correct. For high-throughput genotyping, such as that provided by the GenomeLab SNPstream instrument from Beckman Coulter, Inc., the manual review used for low-volume genotyping becomes a major bottleneck. The work reported here describes the application of a neural network to automate the review of results.
We describe an approach to reviewing the quality of primer extension 2-color fluorescent reactions by clustering optical signals obtained from multiple samples and a single reaction set-up. The method evaluates the quality of the signal clusters from the genotyping results. We developed 64 scores to measure the geometry and position of the signal clusters. The expected signal distribution was represented by a distribution of a 64-component parametric vector obtained by training a two-layer neural network on a set of 10,968 manually reviewed 2D plots containing the signal clusters.
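As a rough illustration of the idea, the sketch below passes a 64-component score vector through a small two-layer network to produce a single review score between 0 and 1. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the hidden-layer width, the activation function, the output encoding, or the training procedure, so `N_HIDDEN`, the sigmoid activations, the random untrained weights, and the names `init_layer`, `layer_forward`, and `review_score` are all hypothetical. "Two-layer" is read here as one hidden layer plus one output layer.

```python
import math
import random

N_SCORES = 64  # from the abstract: 64 cluster-quality scores per plot
N_HIDDEN = 8   # assumption: the hidden-layer width is not stated in the abstract


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one layer, with small random weights.

    Illustrative only: a real model would use weights learned from the
    manually reviewed plots, not random values.
    """
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def sigmoid(x):
    """Logistic activation (an assumed choice)."""
    return 1.0 / (1.0 + math.exp(-x))


def layer_forward(inputs, weights, biases):
    """Apply one fully connected layer followed by a sigmoid."""
    return [
        sigmoid(sum(w * v for w, v in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def review_score(scores, hidden, output):
    """Map a 64-component score vector to a single value in (0, 1)."""
    if len(scores) != N_SCORES:
        raise ValueError(f"expected {N_SCORES} scores, got {len(scores)}")
    hidden_activations = layer_forward(scores, *hidden)
    return layer_forward(hidden_activations, *output)[0]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
hidden = init_layer(N_SCORES, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, 1, rng)

# Placeholder input: random numbers standing in for real cluster scores.
example_scores = [rng.random() for _ in range(N_SCORES)]
print(review_score(example_scores, hidden, output))
```

With untrained weights the printed value carries no meaning; the sketch only shows the shape of the computation, 64 inputs feeding a hidden layer and then a single output.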
The neural network approach described in this paper may be used with results from the GenomeLab SNPstream instrument for high-throughput SNP genotyping. The overall correlation with manual review was 0.844. The approach can also be applied to the quality review of results from other high-throughput fluorescence-based biochemical assays.
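The agreement figure above could be computed along the following lines. This is a hedged sketch: the abstract does not say which correlation measure was used, so Pearson's coefficient is an assumption here, and the function name `pearson` and the sample values are made up for illustration and are not data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: 1 = plot passed manual review, 0 = plot failed,
# paired with illustrative network outputs for the same plots.
manual_calls = [1, 1, 0, 1, 0, 0]
network_scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.4]
print(pearson(manual_calls, network_scores))
```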