Why are they missing?: Bioinformatics characterization of missing human proteins.

Author information

Elguoshy Amr, Magdeldin Sameh, Xu Bo, Hirao Yoshitoshi, Zhang Ying, Kinoshita Naohiko, Takisawa Yusuke, Nameta Masaaki, Yamamoto Keiko, El-Refy Ali, El-Fiky Fawzy, Yamamoto Tadashi

Affiliations

Biofluid Biomarker Center, Institute of Social innovation and Co-operation, Niigata University, Niigata 951-2181, Japan; Biotechnology Department, Faculty of Agriculture, Al-Azhar University, Cairo 11682, Egypt.

Biofluid Biomarker Center, Institute of Social innovation and Co-operation, Niigata University, Niigata 951-2181, Japan; Department of Physiology, Faculty of Veterinary Medicine, Suez Canal University, Ismailia 41522, Egypt.

Publication information

J Proteomics. 2016 Oct 21;149:7-14. doi: 10.1016/j.jprot.2016.08.005. Epub 2016 Aug 13.

Abstract

NeXtProt is a web-based protein knowledge platform that supports research on human proteins. NeXtProt (release 2015-04-28) lists 20,060 proteins, of which 3373 canonical proteins (16.8%) lack credible experimental evidence at the protein level (PE2 to PE5) and are therefore considered "missing proteins". A comprehensive bioinformatic workflow is proposed to analyze these "missing" proteins. The aims of the current study were to analyze the physicochemical properties of the missing proteins, to examine the existence and distribution of their tryptic cleavage sites, and to pinpoint their signature peptides. Our findings showed that 23.7% of missing proteins were hydrophobic proteins possessing transmembrane domains (TMD). In addition, forty missing entries generated tryptic peptides that were either outside the mass detection range (>30 aa) or mapped to different proteins (<9 aa). Furthermore, 21% of missing entries did not generate any unique tryptic peptides. An in silico endopeptidase combination strategy increased the likelihood of identifying missing proteins. Consistent with this, using both a mature protein database and a signal peptidome database could be a promising option for identifying some missing proteins, by targeting their unique N-terminal tryptic peptide from the mature protein database and/or their C-terminal tryptic peptide from the signal peptidome database. In conclusion, identifying missing proteins requires additional consideration during sample preparation, extraction, digestion, and data analysis to increase the rate of identification.
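The tryptic-peptide analysis described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' workflow. It assumes a common simplified trypsin rule (cleave after K or R unless the next residue is P), no missed cleavages, and the 9 to 30 aa detectable window implied by the abstract's ">30aa" and "<9aa" limits. The function names and the toy sequences are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when followed by P.

    Assumption: simplified trypsin rule with no missed cleavages.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the C-terminal peptide that does not end in K or R.
        peptides.append(sequence[start:])
    return peptides


def signature_peptides(proteins, min_len=9, max_len=30):
    """Return, per protein, the tryptic peptides that fall within the
    assumed detectable length window and occur in no other protein."""
    owners = defaultdict(set)
    for name, seq in proteins.items():
        for peptide in tryptic_digest(seq):
            if min_len <= len(peptide) <= max_len:
                owners[peptide].add(name)
    result = {name: [] for name in proteins}
    for peptide, names in owners.items():
        if len(names) == 1:
            result[next(iter(names))].append(peptide)
    return result


# Hypothetical toy sequences, not real neXtProt entries.
toy = {
    "P1": "MAAAAAAAAAKGGGGGGGGGGR",
    "P2": "GGGGGGGGGGRCK",
}
print(signature_peptides(toy))
```

In this toy set, P2 yields no signature peptide: one of its tryptic peptides is shared with P1 and the other is shorter than 9 aa. This mirrors the two failure modes the abstract reports for missing entries. Rerunning the same filter with cleavage rules for other endopeptidases would be one way to explore the combination strategy the abstract mentions.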

