HAR_Locator: a novel protein subcellular location prediction model of immunohistochemistry images based on hybrid attention modules and residual units.

Authors

Zou Kai, Wang Simeng, Wang Ziqian, Zhang Zhihai, Yang Fan

Affiliations

School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China.

Artificial Intelligence and Bioinformation Cognition Laboratory, Jiangxi Science and Technology Normal University, Nanchang, China.

Publication information

Front Mol Biosci. 2023 Aug 17;10:1171429. doi: 10.3389/fmolb.2023.1171429. eCollection 2023.

Abstract

Proteins located in subcellular compartments play an indispensable role in the physiological function of eukaryotic organisms. Knowing where a protein localizes within the cell helps in understanding its mechanism and function, in investigating pathological changes of cells, and in supporting targeted drug research on human diseases. As the number of known proteins has risen considerably, automated systems that combine featurization or representation learning with classifier design have attracted interest for predicting protein subcellular location. However, general deep learning models tend to lose feature information when applied to large-scale, fine-grained protein microscopy images, and shallow features derived from statistical methods have weak supervision ability. In this work, a novel model called HAR_Locator was developed to predict the subcellular location of proteins by concatenating multi-view abstract features with shallow features. Its advantages can be summarized in three points. First, to obtain discriminative abstract features for protein subcellular location, an abstract feature extractor called HARnet, based on Hybrid Attention modules and Residual units, was proposed to relieve gradient dispersion and focus on protein-target regions. Second, concatenating abstract features with shallow features both improves the supervision ability of the image information and enhances the generalization ability of HAR_Locator. Finally, a multi-category, multi-classifier decision system based on an Artificial Neural Network (ANN) was introduced to produce the final output for each sample by fitting the most representative result from five subset predictors. The model was evaluated on a collection of 6,778 immunohistochemistry (IHC) images from the Human Protein Atlas (HPA) database. Compared with baseline predictors, accuracy, precision, and recall increased significantly, to 84.73%, 84.77%, and 84.70%, respectively.
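The two fusion steps described in the abstract, feature-level concatenation and ANN-based decision fusion over five subset predictors, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`concat_features`, `ann_decision`), the feature dimensions, the class count, the hidden-layer size, and the random untrained weights are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def concat_features(abstract_views, shallow):
    """Join multi-view abstract features and shallow features along the feature axis."""
    return np.concatenate(list(abstract_views) + [shallow], axis=1)


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def ann_decision(subset_probs, w1, b1, w2, b2):
    """Fuse class probabilities from several predictors with a one-hidden-layer network.

    subset_probs: list of (n_samples, n_classes) arrays, one per subset predictor.
    """
    x = np.concatenate(subset_probs, axis=1)   # (n_samples, n_predictors * n_classes)
    h = np.maximum(0.0, x @ w1 + b1)           # ReLU hidden layer
    return softmax(h @ w2 + b2)                # final class probabilities


# Hypothetical sizes: only the count of five subset predictors comes from the abstract.
rng = np.random.default_rng(0)
n_samples, n_classes, n_predictors, hidden = 4, 7, 5, 16

# Feature-level fusion: three assumed abstract views plus one shallow feature set.
abstract_views = [rng.normal(size=(n_samples, 32)) for _ in range(3)]
shallow = rng.normal(size=(n_samples, 10))
fused = concat_features(abstract_views, shallow)

# Decision-level fusion: stand-in probability outputs from five subset predictors.
subset_probs = [rng.dirichlet(np.ones(n_classes), size=n_samples)
                for _ in range(n_predictors)]

# Untrained random weights; in practice these would be fitted on held-out predictions.
w1 = rng.normal(scale=0.1, size=(n_predictors * n_classes, hidden))
b1 = np.zeros(hidden)
w2 = rng.normal(scale=0.1, size=(hidden, n_classes))
b2 = np.zeros(n_classes)

final_probs = ann_decision(subset_probs, w1, b1, w2, b2)
predicted_labels = final_probs.argmax(axis=1)
```

Here `fused` would be the input to each subset predictor, and `final_probs` plays the role of the decision system's output. With trained weights, the network can learn to weight predictors differently per class, which a simple majority vote cannot do.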

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2309/10470064/8a908c91fb25/fmolb-10-1171429-g001.jpg