Determining the subcellular location of new proteins from microscope images using local features.

Affiliation

Lane Center for Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA.

Publication information

Bioinformatics. 2013 Sep 15;29(18):2343-9. doi: 10.1093/bioinformatics/btt392. Epub 2013 Jul 8.

DOI:10.1093/bioinformatics/btt392

PMID:23836142

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3753569/

Abstract

MOTIVATION

Evaluation of previous systems for automated determination of subcellular location from microscope images has been done using datasets in which each location class consisted of multiple images of the same representative protein. Here, we frame a more challenging and useful problem where previously unseen proteins are to be classified.

RESULTS

Using CD-tagging, we generated two new image datasets for evaluation of this problem, which contain several different proteins for each location class. Evaluation of previous methods on these new datasets showed that it is much harder to train a classifier that generalizes across different proteins than one that simply recognizes a protein it was trained on. We therefore developed and evaluated additional approaches, incorporating novel modifications of local features techniques. These extended the notion of local features to exploit both the protein image and any reference markers that were imaged in parallel. With these, we obtained a large accuracy improvement in our new datasets over existing methods. Additionally, these features help achieve classification improvements for other previously studied datasets.
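The difference between recognizing a trained protein and generalizing to an unseen one comes down to how images are split for evaluation. The following is a minimal sketch of a leave-one-protein-out split, not the authors' released code: the function name `leave_one_protein_out` and the toy protein and location labels are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def leave_one_protein_out(protein_ids):
    """Yield (held_out_protein, train_indices, test_indices), one per protein.

    protein_ids[i] names the protein shown in image i. Every image of the
    held-out protein goes to the test set, so the classifier is always
    evaluated on a protein it never saw during training.
    """
    by_protein = defaultdict(list)
    for index, protein in enumerate(protein_ids):
        by_protein[protein].append(index)
    for held_out in sorted(by_protein):
        test = by_protein[held_out]
        train = sorted(
            i
            for protein, indices in by_protein.items()
            if protein != held_out
            for i in indices
        )
        yield held_out, train, test


# Hypothetical toy data: eight images, four proteins, two proteins per
# location class (so each class keeps a training protein in every fold).
proteins = ["A", "A", "B", "B", "C", "C", "D", "D"]
locations = ["nucleus"] * 4 + ["golgi"] * 4

splits = list(leave_one_protein_out(proteins))
for held_out, train, test in splits:
    train_classes = sorted({locations[i] for i in train})
    print(held_out, train, test, train_classes)
```

In a per-image split, images of the same protein can land on both sides, which corresponds to the easier setting used to evaluate earlier systems. Grouping by protein only works when each location class contains more than one protein, which is what the new datasets provide.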

AVAILABILITY

The datasets are available for download at http://murphylab.web.cmu.edu/data/. The software was written in Python and C++ and is available under an open-source license at http://murphylab.web.cmu.edu/software/. The code is split into a library, which can be easily reused for other data, and a small driver script for reproducing all results presented here. A step-by-step tutorial on applying the methods to new datasets is also available at that address.

CONTACT

murphy@cmu.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
