Zhou Jie, Peng Hanchuan
Department of Computer Science, Northern Illinois University, DeKalb, IL 60115, USA.
Bioinformatics. 2007 Mar 1;23(5):589-96. doi: 10.1093/bioinformatics/btl680. Epub 2007 Jan 19.
Gene expression patterns obtained by in situ mRNA hybridization provide important information about different genes during Drosophila embryogenesis. So far, these images have been annotated manually, by assigning a subset of anatomy ontology terms to each image. This time-consuming process depends heavily on the consistency of the experts.
We develop a system that automatically annotates the fruit fly embryonic tissues in which a gene is expressed. We formulate the task as an image pattern recognition problem. For a new fly embryo image, our system answers two questions: (1) Which stage range does the image belong to? (2) Which annotations should be assigned to the image? We propose to extract wavelet embryo features with a multi-resolution 2D discrete wavelet transform, followed by min-redundancy max-relevance feature selection, which yields optimal distinguishing features for each annotation. Because each image may correspond to multiple annotations, we then construct a series of parallel bi-class predictors to solve the multi-objective annotation problem.
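The pipeline described above (wavelet features, min-redundancy max-relevance selection, one two-class problem per annotation) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the wavelet family, the discretization, or the selection criterion, so the Haar wavelet, the mean-threshold binarization, the "relevance minus mean redundancy" score, and every function name below are assumptions chosen for illustration.

```python
import math
from collections import Counter

def haar_1d(v):
    """One Haar level: scaled pairwise sums, then scaled pairwise differences."""
    s = math.sqrt(2.0)
    avg = [(v[i] + v[i + 1]) / s for i in range(0, len(v), 2)]
    diff = [(v[i] - v[i + 1]) / s for i in range(0, len(v), 2)]
    return avg + diff

def haar_2d(img, levels):
    """Multi-resolution 2D Haar DWT of a square image (side divisible by 2**levels).
    Assumption: Haar is a stand-in; the abstract does not specify the wavelet."""
    out = [[float(x) for x in row] for row in img]
    n = len(out)
    for _ in range(levels):
        for r in range(n):                       # rows of the current approximation block
            out[r][:n] = haar_1d(out[r][:n])
        for c in range(n):                       # then its columns
            col = haar_1d([out[r][c] for r in range(n)])
            for r in range(n):
                out[r][c] = col[r]
        n //= 2                                  # recurse into the low-pass quadrant
    return [x for row in out for x in row]       # flatten into a feature vector

def discretize(column):
    """Assumption: binarize one feature across images at its mean."""
    mean = sum(column) / len(column)
    return [int(x > mean) for x in column]

def mutual_info(x, y):
    """Mutual information (bits) between two discrete sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * math.log2(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())

def mrmr(features, labels, k):
    """Greedy mRMR: maximize relevance minus mean redundancy with chosen features.
    `features` is a list of discretized feature columns (one value per image)."""
    relevance = [mutual_info(f, labels) for f in features]
    selected = [max(range(len(features)), key=lambda j: relevance[j])]
    while len(selected) < k:
        def score(j):
            redundancy = sum(mutual_info(features[j], features[s])
                             for s in selected) / len(selected)
            return relevance[j] - redundancy
        rest = [j for j in range(len(features)) if j not in selected]
        selected.append(max(rest, key=score))
    return selected

def select_per_annotation(features, image_terms, k):
    """One two-class problem per annotation term: image has the term or not."""
    terms = sorted({t for ts in image_terms for t in ts})
    return {t: mrmr(features, [int(t in ts) for ts in image_terms], k)
            for t in terms}
```

In this sketch each image would be passed through `haar_2d`, the resulting coefficient columns discretized, and `select_per_annotation` called once to obtain a separate feature subset for every annotation term. A two-class classifier trained on each subset would then give the parallel predictors; the choice of classifier is left open here because the abstract does not state it.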
The complete annotation prediction results are available at http://www.cs.niu.edu/~jzhou/papers/fruitfly and http://research.janelia.org/peng/proj/fly_embryo_annotation/. The datasets used in the experiments are available upon request from the corresponding author.