Consistency and variation of protein subcellular location annotations.

Authors

Xu Ying-Ying, Zhou Hang, Murphy Robert F, Shen Hong-Bin

Affiliations

School of Biomedical Engineering, Southern Medical University, Guangzhou, China.

Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China.

Publication information

Proteins. 2021 Feb;89(2):242-250. doi: 10.1002/prot.26010. Epub 2020 Sep 26.

Abstract

A major challenge for protein databases is reconciling information from diverse sources. This is especially difficult when some information consists of secondary, human-interpreted rather than primary data. For example, the Swiss-Prot database contains curated annotations of subcellular location that are based on predictions from protein sequence, statements in scientific articles, and published experimental evidence. The Human Protein Atlas (HPA) consists of millions of high-resolution microscopic images that show protein spatial distribution on a cellular and subcellular level. These images are manually annotated with protein subcellular locations by trained experts. The image annotations in HPA can capture the variation of subcellular location across different cell lines, tissues, or tissue states. Systematic investigation of the consistency between HPA and Swiss-Prot assignments of subcellular location, which is important for understanding and utilizing protein location data from the two databases, has not been described previously. In this paper, we quantitatively evaluate the consistency of subcellular location annotations between HPA and Swiss-Prot at multiple levels, as well as variation of protein locations across cell lines and tissues. Our results show that annotations of these two databases differ significantly in many cases, leading to proposed procedures for deriving and integrating the protein subcellular location data. We also find that proteins having highly variable locations are more likely to be biomarkers of diseases, providing support for incorporating analysis of subcellular location in protein biomarker identification and screening.
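As an illustration of what a per-protein consistency check between two annotation sources could look like, here is a minimal sketch. The abstract does not state which measure the authors used, so the Jaccard overlap, the three agreement categories, the function names, and the toy protein identifiers and labels below are all assumptions made for illustration. The sketch also assumes that location terms from both sources have already been mapped to one shared vocabulary.

```python
from typing import Dict, Set, Tuple

Annotations = Dict[str, Set[str]]  # protein ID -> set of location labels


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two label sets; defined as 1.0 when both are empty."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def agreement_category(a: Set[str], b: Set[str]) -> str:
    """Coarse, hypothetical categories: identical, partial, or disjoint."""
    if a == b:
        return "identical"
    if a & b:
        return "partial"
    return "disjoint"


def compare(src_a: Annotations, src_b: Annotations) -> Dict[str, Tuple[str, float]]:
    """Compare only proteins annotated in both sources."""
    shared = sorted(src_a.keys() & src_b.keys())
    return {
        pid: (agreement_category(src_a[pid], src_b[pid]),
              jaccard(src_a[pid], src_b[pid]))
        for pid in shared
    }


# Toy data, invented for this example; not taken from HPA or Swiss-Prot.
hpa_like = {
    "PROT_A": {"nucleus"},
    "PROT_B": {"cytosol", "nucleus"},
    "PROT_C": {"mitochondria"},
}
swissprot_like = {
    "PROT_A": {"nucleus"},
    "PROT_B": {"cytosol"},
    "PROT_C": {"golgi apparatus"},
}

report = compare(hpa_like, swissprot_like)
for pid, (category, score) in report.items():
    print(pid, category, round(score, 2))
```

Restricting the comparison to proteins present in both sources keeps missing annotations from being counted as disagreement; whether that is appropriate depends on the question being asked.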
