Boland M V, Murphy R F
Center for Light Microscope Imaging and Biotechnology, Biomedical and Health Engineering Program, Carnegie Mellon University, 4400 Fifth Ave., Pittsburgh, PA 15213, USA.
Bioinformatics. 2001 Dec;17(12):1213-23. doi: 10.1093/bioinformatics/17.12.1213.
Assessment of protein subcellular location is crucial to proteomics efforts since localization information provides a context for a protein's sequence, structure, and function. The work described below is the first to address the subcellular localization of proteins in a quantitative, comprehensive manner.
Images for ten different subcellular patterns (including all major organelles) were collected using fluorescence microscopy. The patterns were described using a variety of numeric features, including Zernike moments, Haralick texture features, and a set of new features developed specifically for this purpose. To test the usefulness of these features, they were used to train a neural network classifier. The classifier was able to correctly recognize an average of 83% of previously unseen cells showing one of the ten patterns. The same classifier was then used to recognize previously unseen sets of homogeneously prepared cells with 98% accuracy.
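The gap between the 83% single-cell figure and the 98% figure for sets of cells can be illustrated with a small sketch. The abstract does not say how a set of cells was assigned a label, so the plurality-vote rule below is an assumption, and the function name `classify_set` and the example labels are hypothetical. It is a minimal illustration of why pooling per-cell predictions from a homogeneous population can outperform any single prediction, not a reconstruction of the authors' code.

```python
from collections import Counter


def classify_set(cell_predictions):
    """Return the plurality label for a set of per-cell predictions.

    Assumption: every cell in the set shows the same true pattern, so the
    most frequent predicted label is taken as the label of the whole set.
    Returns None for an empty set or when the top labels are tied.
    """
    counts = Counter(cell_predictions).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie for first place: leave the set unclassified
    return counts[0][0]


# Hypothetical per-cell outputs for one homogeneously prepared set of ten cells.
# Three cells are misclassified, yet the set-level vote still agrees with the majority.
predictions = [
    "golgi", "golgi", "lysosome", "golgi", "nucleolus",
    "golgi", "golgi", "endosome", "golgi", "golgi",
]
set_label = classify_set(predictions)
```

Under this rule, individual errors only change the outcome when they outnumber the correct votes, which is consistent with set-level accuracy exceeding single-cell accuracy.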
Algorithms were implemented using the commercial products Matlab, S-Plus, and SAS, as well as some functions written in C. The scripts and source code generated for this work are available at http://murphylab.web.cmu.edu/software.