

ImPLoc: a multi-instance deep learning model for the prediction of protein subcellular localization based on immunohistochemistry images.

Affiliations

Department of Computer Science and Engineering, Center for Brain-Like Computing and Machine Intelligence, Shanghai Jiao Tong University, Shanghai 200240, China.

Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai 200240, China.

Publication information

Bioinformatics. 2020 Apr 1;36(7):2244-2250. doi: 10.1093/bioinformatics/btz909.

Abstract

MOTIVATION

The Tissue Atlas of the Human Protein Atlas (HPA) houses immunohistochemistry (IHC) images that visualize protein distribution from the tissue level down to the cell level, providing an important resource for studying the human spatial proteome. In particular, the protein subcellular localization patterns revealed by these images are helpful for understanding protein functions, and differential localization analysis across normal and cancer tissues can lead to new cancer biomarkers. However, computational tools for processing the images in this database are highly underdeveloped. Recognition of localization patterns suffers from variation in image quality and the difficulty of detecting microscopic targets.

RESULTS

We propose a deep multi-instance multi-label model, ImPLoc, to predict subcellular locations from IHC images. In this model, we employ a deep convolutional neural network-based feature extractor to represent image features, and design a multi-head self-attention encoder to aggregate multiple feature vectors for subsequent prediction. We construct a benchmark dataset of 1186 proteins, comprising 7855 images from HPA and covering 6 subcellular locations. The experimental results show that ImPLoc achieves significantly higher prediction accuracy than current computational methods. We further apply ImPLoc to a test set of 889 proteins with images from both normal and cancer tissues, and obtain 8 differentially localized proteins at a significance level of 0.05.
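
The aggregation step described above can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository linked below for that); it only shows how a bag of per-image feature vectors for one protein might pass through multi-head self-attention and yield one independent probability per subcellular location. Everything specific here is an assumption made for illustration: the random vectors stand in for CNN features, the weights are untrained, the dimensions are arbitrary, and mean pooling over the encoded instances is a choice the abstract does not specify.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def random_matrix(rows, cols, rng):
    """Untrained placeholder weights; a real model would learn these."""
    scale = 1.0 / math.sqrt(cols)
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

def self_attention_head(bag, Wq, Wk, Wv):
    """Scaled dot-product self-attention over the instances of one bag."""
    Q = [matvec(Wq, x) for x in bag]
    K = [matvec(Wk, x) for x in bag]
    V = [matvec(Wv, x) for x in bag]
    d_k = len(Q[0])
    out = []
    for q in Q:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, V)) for j in range(len(V[0]))])
    return out

def predict_bag(bag, heads, W_out, b_out):
    """Encode a bag with several heads, pool it, and score each location."""
    per_head = [self_attention_head(bag, *h) for h in heads]
    # Concatenate the head outputs for each instance.
    encoded = [[v for head_out in per_head for v in head_out[i]] for i in range(len(bag))]
    # Assumption: mean pooling collapses the bag into a single vector.
    dim = len(encoded[0])
    pooled = [sum(e[j] for e in encoded) / len(encoded) for j in range(dim)]
    logits = [z + b for z, b in zip(matvec(W_out, pooled), b_out)]
    # Multi-label output: one sigmoid per location, not a softmax across them.
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

rng = random.Random(0)
FEATURE_DIM, HEAD_DIM, NUM_HEADS, NUM_LOCATIONS = 8, 4, 2, 6  # illustrative sizes

heads = [tuple(random_matrix(HEAD_DIM, FEATURE_DIM, rng) for _ in range(3))
         for _ in range(NUM_HEADS)]
W_out = random_matrix(NUM_LOCATIONS, HEAD_DIM * NUM_HEADS, rng)
b_out = [0.0] * NUM_LOCATIONS

# One protein = one bag; here 5 images, each a stand-in feature vector.
bag = [[rng.gauss(0.0, 1.0) for _ in range(FEATURE_DIM)] for _ in range(5)]
probs = predict_bag(bag, heads, W_out, b_out)
print([round(p, 3) for p in probs])
```

Because the sketch uses no positional encoding and pools by averaging, the prediction does not depend on the order of the images in the bag, and bags of different sizes can be scored with the same weights. Both properties suit a multi-instance setting in which each protein has a variable, unordered set of images.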

AVAILABILITY AND IMPLEMENTATION

https://github.com/yl2019lw/ImPloc.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

