Department of Biomedical Informatics, Arizona State University, Phoenix, AZ 85004-2157, USA.
J Biomed Inform. 2010 Jun;43(3):428-34. doi: 10.1016/j.jbi.2009.11.003. Epub 2009 Nov 28.
High-throughput parallel genomic sequencing (Next Generation Sequencing, NGS) shifts the bottleneck in sequencing from experimental data production to computationally intensive, informatics-based data analysis. This manuscript introduces a biomedical informatics pipeline (BING) for the analysis of NGS data that offers several novel computational approaches to (1) image alignment; (2) signal correlation, compensation, separation, and pixel-based cluster registration; (3) signal measurement and base calling; and (4) quality control and accuracy measurement. These approaches address many of the informatics challenges of NGS, including image processing, computational performance, and accuracy. The new algorithms are benchmarked against the Illumina Genome Analysis Pipeline. BING is one of the first software tools to perform pixel-based analysis of NGS data. Compared with the Illumina informatics tool, BING's pixel-based approach produces a significant increase in the number of sequence reads while reducing both the computational time per experiment and the error rate (<2%). This approach has the potential to increase the density and throughput of NGS technologies.
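As an illustration of step (3), signal measurement and base calling, the following is a minimal sketch and not BING's actual algorithm, which the abstract does not describe. It assumes that each sequencing cycle yields four background-corrected channel intensities in A, C, G, T order for a given pixel or cluster. The function names (`call_base`, `call_read`), the purity ratio, and the `min_purity` threshold are all hypothetical choices made for this example.

```python
BASES = "ACGT"  # assumed channel order; real instruments may differ


def call_base(intensities, min_purity=0.6):
    """Call one base from four channel intensities (A, C, G, T order).

    Returns (base, purity), where purity = top / (top + second) and
    negative second-place values are treated as zero. The call is "N"
    when there is no positive signal or purity is below min_purity.
    """
    if len(intensities) != 4:
        raise ValueError("expected four channel intensities")
    # Rank channel indices from brightest to dimmest.
    ranked = sorted(range(4), key=lambda i: intensities[i], reverse=True)
    top = intensities[ranked[0]]
    second = max(intensities[ranked[1]], 0)
    if top <= 0:
        return "N", 0.0
    purity = top / (top + second)
    if purity < min_purity:
        return "N", purity
    return BASES[ranked[0]], purity


def call_read(cycles, min_purity=0.6):
    """Call a read from a list of per-cycle intensity quadruples."""
    calls = [call_base(c, min_purity) for c in cycles]
    sequence = "".join(base for base, _ in calls)
    purities = [purity for _, purity in calls]
    return sequence, purities


if __name__ == "__main__":
    example_cycles = [
        [900, 50, 30, 20],   # clear A signal
        [10, 20, 800, 40],   # clear G signal
        [300, 290, 5, 5],    # ambiguous between A and C
    ]
    sequence, purities = call_read(example_cycles)
    print(sequence, [round(p, 3) for p in purities])
```

A per-cycle purity score of this kind is one simple way to tie base calling to quality control (step 4): low-purity cycles can be masked or used to filter whole reads. The threshold of 0.6 is arbitrary here and would need to be tuned against real data.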