IEEE/ACM Trans Comput Biol Bioinform. 2020 Nov-Dec;17(6):1966-1980. doi: 10.1109/TCBB.2019.2917429. Epub 2020 Dec 8.
Predicting protein subcellular location has become an active research topic because it has proven useful both for understanding disease mechanisms and for designing novel drugs. With the rapid development of automated microscopic imaging in recent years, bioimage-based methods for classifying protein subcellular location have attracted considerable attention, because images describe protein distribution intuitively and in detail. In this study, a method for predicting protein subcellular location was proposed based on multi-view image features extracted from three views: four texture features of the original image; global and local protein features extracted from the protein channel image after color segmentation; and global DNA features extracted from the DNA channel image. The extracted features were then combined to improve prediction performance, and the best ensemble of features was identified by comparing different feature combinations under the same classifier. A classifier based on stacked auto-encoders and random forest was also proposed, combining a deep network with a traditional statistical classification method to improve the prediction results. Stringent cross-validation and independent validation tests on the benchmark dataset demonstrated the efficacy of the proposed method.
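The feature-combination step described in the abstract (fuse per-view features, score every combination under one fixed classifier, keep the best) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the view names `texture`, `protein`, and `dna`, the toy feature values, the two class labels, and the helper functions are hypothetical. A nearest-centroid rule with leave-one-out validation stands in for the classifier; it is not the stacked auto-encoder and random forest model of the paper.

```python
from itertools import combinations

def fuse(views, names, i):
    """Concatenate sample i's feature vectors from the chosen views."""
    return [x for name in names for x in views[name][i]]

def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier: return the label whose mean vector is closest to x."""
    sums, counts = {}, {}
    for vec, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for j, v in enumerate(vec):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1

    def sq_dist(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=sq_dist)

def loo_accuracy(views, labels, names):
    """Leave-one-out accuracy of the stand-in classifier on one view combination."""
    n = len(labels)
    X = [fuse(views, names, i) for i in range(n)]
    hits = 0
    for i in range(n):
        train_X = X[:i] + X[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        hits += nearest_centroid_predict(train_X, train_y, X[i]) == labels[i]
    return hits / n

def best_combination(views, labels):
    """Score every non-empty combination of views with the same classifier."""
    names = sorted(views)
    scores = {}
    for k in range(1, len(names) + 1):
        for combo in combinations(names, k):
            scores[combo] = loo_accuracy(views, labels, combo)
    return max(scores, key=scores.get), scores

# Hypothetical toy data: six images, two classes, three feature views.
labels = ["nucleus"] * 3 + ["cytoplasm"] * 3
views = {
    "texture": [[0.2], [0.8], [0.5], [0.3], [0.7], [0.5]],
    "protein": [[0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
                [1.0, 0.9], [0.9, 1.0], [0.95, 0.95]],
    "dna": [[0.4], [0.5], [0.6], [0.5], [0.6], [0.4]],
}

best, scores = best_combination(views, labels)
for combo, acc in scores.items():
    print("+".join(combo), round(acc, 2))
print("best:", "+".join(best))
```

With three views the search covers seven combinations. In a realistic setting the stand-in classifier would be replaced by the model under study, and leave-one-out by whichever cross-validation protocol the benchmark calls for.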