Dual-Signal Feature Spaces Map Protein Subcellular Locations Based on Immunohistochemistry Image and Protein Sequence.

Affiliations

School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang 330038, China.

School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China.

Publication information

Sensors (Basel). 2023 Nov 7;23(22):9014. doi: 10.3390/s23229014.

Abstract

Proteins are among the primary biochemical macromolecular regulators in the compartmentalized cellular structure, so their subcellular locations can provide information on the function of subcellular structures and physiological environments. Recently, data-driven systems have been developed to predict the subcellular location of proteins from protein sequences, immunohistochemistry (IHC) images, or immunofluorescence (IF) images. However, the fusion of multiple protein signals has received little attention. In this study, we developed a dual-signal computational protocol that combines IHC images with protein sequences to learn protein subcellular localization. The protocol comprises three major steps. First, we built a benchmark database of 281 proteins, selected from 4722 proteins in the Human Protein Atlas (HPA) and Swiss-Prot databases, that are associated with the endoplasmic reticulum (ER), Golgi apparatus, cytosol, and nucleoplasm. Second, discriminative feature operators were employed to quantify protein image-sequence samples, each comprising IHC images and a protein sequence. Finally, the feature subspaces of the different protein signals were used to construct multiple sub-classifiers via dimensionality reduction and binary relevance (BR), and the confidences produced by these sub-classifiers were combined by a centralized voting mechanism at the decision layer to determine the subcellular location. The experimental results indicated that the dual-signal model, which embeds both IHC images and protein sequences, outperformed the single-signal models, achieving an accuracy, precision, and recall of 75.41%, 80.38%, and 74.38%, respectively. These findings can inform further research on protein subcellular location prediction based on multi-signal fusion.
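The decision-layer step can be illustrated with a minimal sketch. The abstract does not specify how the centralized voting mechanism combines confidences, so the code below assumes a simple rule: average the per-label confidences from each sub-classifier and keep every label whose mean reaches a threshold. The function name `fuse_confidences`, the 0.5 threshold, the top-label fallback, and the example confidence values are all hypothetical and are not taken from the paper.

```python
from statistics import mean

# The four subcellular locations named in the abstract.
LABELS = ["ER", "Golgi apparatus", "cytosol", "nucleoplasm"]


def fuse_confidences(per_classifier, threshold=0.5):
    """Fuse per-label confidences from several sub-classifiers.

    per_classifier: list of dicts mapping label -> confidence in [0, 1],
    one dict per sub-classifier (for example, one trained on image
    features and one on sequence features).

    Assumption: fusion is a plain average followed by a threshold; the
    paper's actual voting rule may differ.
    """
    fused = {label: mean(c[label] for c in per_classifier) for label in LABELS}
    predicted = [label for label in LABELS if fused[label] >= threshold]
    if not predicted:
        # Hypothetical fallback: assign the top-scoring label so that
        # every protein receives at least one location.
        predicted = [max(LABELS, key=lambda name: fused[name])]
    return fused, predicted


# Made-up confidences for one protein from two binary-relevance models.
image_conf = {"ER": 0.2, "Golgi apparatus": 0.7, "cytosol": 0.6, "nucleoplasm": 0.1}
sequence_conf = {"ER": 0.1, "Golgi apparatus": 0.5, "cytosol": 0.3, "nucleoplasm": 0.2}

fused, predicted = fuse_confidences([image_conf, sequence_conf])
print(predicted)
```

Under binary relevance, each label is scored by its own binary classifier, which is why the sketch treats the four locations independently and can return more than one label for a protein.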

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cb47/10675401/1aa36b2f8ad4/sensors-23-09014-g001.jpg
