Self-supervised learning of cell type specificity from immunohistochemical images.

Affiliations

Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.

Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA.

Publication information

Bioinformatics. 2022 Jun 24;38(Suppl 1):i395-i403. doi: 10.1093/bioinformatics/btac263.

Abstract

MOTIVATION

Advances in bioimaging now permit in situ proteomic characterization of cell-cell interactions in complex tissues, with important applications across a spectrum of biological problems from development to disease. These methods depend on selection of antibodies targeting proteins that are expressed specifically in particular cell types. Candidate marker proteins are often identified from single-cell transcriptomic data, with variable rates of success, in part due to divergence between expression levels of proteins and the genes that encode them. In principle, marker identification could be improved by using existing databases of immunohistochemistry for thousands of antibodies in human tissue, such as the Human Protein Atlas. However, these data lack detailed annotations of the types of cells in each image.

RESULTS

We develop a method to predict cell type specificity of protein markers from unlabeled images. We train a convolutional neural network with a self-supervised objective to generate embeddings of the images. Using non-linear dimensionality reduction, we observe that the model clusters images according to cell types and anatomical regions for which the stained proteins are specific. We then use estimates of cell type specificity derived from an independent single-cell transcriptomics dataset to train an image classifier, without requiring any human labelling of images. Our scheme demonstrates superior classification of known proteomic markers in kidney compared to selection via single-cell transcriptomics.
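The abstract does not name the self-supervised objective. The repository name (HPA-SimCLR) suggests a SimCLR-style contrastive loss, so the sketch below illustrates that kind of objective (NT-Xent) in NumPy. It is an assumption-laden illustration, not the authors' implementation: the function name `nt_xent_loss`, the temperature value, and the random stand-in embeddings are all hypothetical. In practice, `z_a` and `z_b` would be CNN embeddings of two augmented views of the same images.

```python
import numpy as np


def nt_xent_loss(z_a, z_b, temperature=0.5):
    """Contrastive (NT-Xent) loss for paired embeddings of shape (n, d).

    Row i of z_a and row i of z_b are assumed to come from two augmented
    views of the same image; every other row in the batch is a negative.
    """
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)            # (2n, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit norm: dot product = cosine
    sim = z @ z.T / temperature                       # (2n, 2n) scaled similarities
    np.fill_diagonal(sim, -np.inf)                    # a view is never its own candidate

    # Index of the positive partner for each row: i <-> i + n.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])

    # Numerically stable log-softmax over each row, evaluated at the positive.
    row_max = sim.max(axis=1, keepdims=True)
    log_norm = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    log_prob_pos = sim[np.arange(2 * n), pos] - log_norm
    return float(-log_prob_pos.mean())


# Usage with random stand-in embeddings (hypothetical data, not HPA images).
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
loss_aligned = nt_xent_loss(z, z + 0.01 * rng.normal(size=z.shape))
loss_random = nt_xent_loss(z, rng.normal(size=z.shape))
```

Minimizing a loss of this form pulls the two views of each image together and pushes other images apart, which is one way an encoder could learn embeddings that cluster by staining pattern without any labels.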

AVAILABILITY AND IMPLEMENTATION

Code and trained model are available at www.github.com/murphy17/HPA-SimCLR.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/63e2/9235491/f32c25836aa7/btac263f1.jpg
