Center for Theragnosis, Korea Institute of Science and Technology, Seoul, 02792, Republic of Korea.
Department of Biological Chemistry, Korea University of Science and Technology, Daejeon, 34113, Republic of Korea.
Sci Rep. 2017 Jul 26;7(1):6599. doi: 10.1038/s41598-017-06314-9.
Protein variants (proteoforms) arise from genetic variation, alternative splicing, alternative translation initiation, co- or post-translational modification, and proteolysis. Distinct proteoforms can be discovered in part by characterizing their N-terminal sequences. Here, we introduce Nrich, a method for enriching N-terminal peptides. The method is based on filter-aided negative selection and uses two N-blocking reagents and two endoproteases. We identified 6,525 acetylated (or partially acetylated) and 6,570 free protein N-termini arising from 5,727 proteins in HEK293T human cells. The protein N-termini included translation initiation sites annotated in the UniProtKB database, putative alternative translation initiation sites, and N-terminal sites exposed after signal-, transit-, or pro-peptide removal or after unknown processing, revealing various proteoforms in cells. In addition, 46 novel protein N-termini were identified in 5' untranslated region (UTR) sequences with pseudo start codons. These observed N-terminal sequences of mature proteins constitute a useful resource that may aid a better understanding of the various proteoforms in cells.