Yin Zheng, Zhou Xiaobo, Sun Youxian, Wong Stephen T C
State Key Laboratory of Industrial Control Technology, Zhejiang University, 38 Zheda Road, Hangzhou, Zhejiang Province 310027, China.
Pattern Recognit. 2009 Apr;42(4):509-522. doi: 10.1016/j.patcog.2008.09.032.
Identifying and validating novel phenotypes from images that arrive online is a major challenge in high-content RNA interference (RNAi) screening. Newly discovered phenotypes should be visually distinct from existing ones and make biological sense. We propose an online phenotype discovery method that combines adaptive phenotype modeling with iterative cluster merging based on an improved gap statistic. Clustering results based on compactness criteria and Gaussian mixture models (GMMs) of the existing phenotypes iteratively refine each other through multiple hypothesis testing and model optimization based on minimum classification error (MCE). Applied to both synthetic datasets and RNAi high-content screening (HCS) images with ground-truth labels, the method performs well at adaptively discovering new phenotypes.
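The abstract does not spell out the improved gap statistic, so the following is only a minimal sketch of the standard gap statistic (Tibshirani, Walther and Hastie) that it builds on, not the authors' method. It assumes plain k-means as the clustering step and uniform reference data drawn from the bounding box of the features. The helper names `sq_dist`, `within_dispersion` and `gap_statistic`, and the toy data, are hypothetical.

```python
import math
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def within_dispersion(points, k, rng, restarts=3, iters=50):
    """Lowest within-cluster sum of squares over a few k-means restarts."""
    best = float("inf")
    for _ in range(restarts):
        centers = rng.sample(points, k)  # initialise centers from the data
        for _ in range(iters):
            groups = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda j: sq_dist(p, centers[j]))
                groups[nearest].append(p)
            # An empty cluster keeps its previous center.
            updated = [
                tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[j]
                for j, g in enumerate(groups)
            ]
            if updated == centers:
                break
            centers = updated
        wss = sum(min(sq_dist(p, c) for c in centers) for p in points)
        best = min(best, wss)
    return best


def gap_statistic(points, k_max, n_refs=10, seed=0):
    """Return (chosen k, list of Gap(1..k_max)) for the standard gap statistic."""
    rng = random.Random(seed)
    lows = [min(c) for c in zip(*points)]
    highs = [max(c) for c in zip(*points)]
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        log_w = math.log(within_dispersion(points, k, rng))
        ref_logs = []
        for _ in range(n_refs):
            # Reference set: uniform samples over the bounding box of the data.
            ref = [
                tuple(rng.uniform(lo, hi) for lo, hi in zip(lows, highs))
                for _ in points
            ]
            ref_logs.append(math.log(within_dispersion(ref, k, rng)))
        mean_ref = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((v - mean_ref) ** 2 for v in ref_logs) / n_refs)
        gaps.append(mean_ref - log_w)
        errs.append(sd * math.sqrt(1 + 1 / n_refs))
    # Smallest k with Gap(k) >= Gap(k+1) - s_{k+1}.
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k, gaps
    return k_max, gaps


# Toy data (an assumption for illustration): three well-separated 2-D blobs.
data_rng = random.Random(1)
blobs = [
    (cx + data_rng.gauss(0, 0.5), cy + data_rng.gauss(0, 0.5))
    for cx, cy in [(0, 0), (10, 0), (0, 10)]
    for _ in range(20)
]
k, gaps = gap_statistic(blobs, k_max=5)
print("chosen k:", k)
print("gap values:", [round(g, 2) for g in gaps])
```

In an online setting such as the one described, a cluster count chosen this way could serve as one signal for whether incoming cells form a group separate from the existing phenotype models. How the paper modifies the statistic and couples it with the GMMs is not stated in the abstract.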