Cumbaa Christian A, Jurisica Igor
Division of Signaling Biology, Ontario Cancer Institute, University Health Network, Toronto Medical Discovery Tower, 9-305, 101 College Street, Toronto, ON, M5G 1L7, Canada.
J Struct Funct Genomics. 2010 Mar;11(1):61-9. doi: 10.1007/s10969-009-9076-9. Epub 2010 Jan 14.
We have developed an image-analysis and classification system for automatically scoring images from high-throughput protein crystallization trials. Image analysis for this system is performed by the Help Conquer Cancer (HCC) project on the World Community Grid. HCC calculates 12,375 distinct image features on microbatch-under-oil images from the Hauptman-Woodward Medical Research Institute's High-Throughput Screening Laboratory. Using HCC-computed image features and a training set of 165,351 hand-scored images, we have trained several Random Forest classifiers that accurately recognize a range of crystallization outcomes, including crystals, clear drops, precipitate, and others. The system successfully recognizes 80% of crystal-bearing images, 89% of precipitate images, and 98% of clear drops.
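The recognition rates quoted in the abstract (80%, 89%, 98%) read most naturally as per-class recall: the fraction of hand-scored images of a given outcome that the classifier assigns to that same outcome. The sketch below illustrates that calculation only, under that assumed reading. The function name `per_class_recall` and the toy labels are hypothetical and are not taken from the study or from HCC.

```python
from collections import Counter


def per_class_recall(true_labels, predicted_labels):
    """Return {class: fraction of that class's images labeled correctly}."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    # Number of hand-scored images per outcome class.
    totals = Counter(true_labels)
    # Number of images per class where the prediction matches the hand score.
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {cls: hits[cls] / totals[cls] for cls in totals}


# Hypothetical toy labels, NOT data from the study.
truth = ["crystal"] * 5 + ["precipitate"] * 4 + ["clear"] * 3
predicted = (
    ["crystal"] * 4 + ["precipitate"]    # one crystal image missed
    + ["precipitate"] * 3 + ["clear"]    # one precipitate image missed
    + ["clear"] * 3                      # every clear drop recognized
)

recall = per_class_recall(truth, predicted)
for outcome, value in recall.items():
    print(f"{outcome}: {value:.0%}")
```

Reporting recall per outcome rather than a single overall accuracy keeps a rare but important class, such as crystals, from being masked by the much more common clear and precipitate images.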