Vasu Kommireddy, Khan Debjit, Ramachandiran Iyappan, Blankenberg Daniel, Fox Paul L
Department of Cardiovascular and Metabolic Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA.
Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA.
NAR Genom Bioinform. 2022 Oct 19;4(4):lqac076. doi: 10.1093/nargab/lqac076. eCollection 2022 Dec.
Transcriptional and post-transcriptional mechanisms diversify the proteome beyond gene number while maintaining a sequence relationship between the original and altered proteins. A new mechanism breaks this paradigm, generating novel proteins by translating alternative open reading frames (Alt-ORFs) within canonical host mRNAs. Uniquely, 'alt-proteins' lack sequence homology with host ORF-derived proteins. We show that the global amino acid frequencies, and the consequent biochemical characteristics, of Alt-ORFs nested within host ORFs (nAlt-ORFs) are genetically driven and are predicted by summing the frequencies of hundreds of encompassing host codon pairs. Analysis of 101 human nAlt-ORFs of length ≥150 codons confirms the theoretical predictions, revealing an extraordinarily high median isoelectric point (pI) of 11.68, due to anomalous levels of charged amino acids. In addition, nAlt-ORF proteins exhibit a >2-fold preference for reading frame 2 versus frame 3, predicted mitochondrial and nuclear localization, and an elevated codon adaptation index indicative of natural selection. Our results provide a theoretical and conceptual framework for exploring these largely unannotated, but potentially significant, alternative ORFs and their encoded proteins.
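The idea of an Alt-ORF nested in a host ORF can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that frame 1 is the host reading frame and that frames 2 and 3 are offset by +1 and +2 nucleotides, that an nAlt-ORF runs from an ATG to the next in-frame stop codon, and that length is counted in codons excluding the stop. The function names `translate_frame` and `find_nested_alt_orfs` and the toy sequence are hypothetical. Each shifted codon spans two adjacent host codons, which is why host codon-pair frequencies constrain the alt-frame amino acid composition.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frame(seq, offset):
    """Translate seq starting at a 0-based nucleotide offset.

    Offsets 1 and 2 correspond to frames 2 and 3 under the assumed
    convention; each codon read this way straddles a host codon pair.
    Codons with non-ACGT characters are reported as 'X'.
    """
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(offset, len(seq) - 2, 3)
    )


def find_nested_alt_orfs(host_orf, min_codons=150):
    """Return (frame, start_nt, protein) for ORFs nested in frames 2 and 3.

    An ORF is taken as the first ATG after the previous stop, up to the
    next in-frame stop inside the host ORF (an assumption of this sketch).
    min_codons defaults to 150, mirroring the length cutoff in the text.
    """
    hits = []
    for offset in (1, 2):
        protein = translate_frame(host_orf, offset)
        start = None
        for idx, aa in enumerate(protein):
            if aa == "M" and start is None:
                start = idx
            elif aa == "*":
                if start is not None and idx - start >= min_codons:
                    hits.append((offset + 1, offset + 3 * start, protein[start:idx]))
                start = None
    return hits


# Toy host ORF (hypothetical): ATG GAT GAA ACG CTA AGG TGA
toy_host = "ATGGATGAAACGCTAAGGTGA"
print(find_nested_alt_orfs(toy_host, min_codons=3))
```

On the toy sequence, the frame-2 reading contains ATG AAA CGC TAA, so the sketch reports a single three-codon ORF in frame 2 and none in frame 3. A real analysis would also need annotated host ORF coordinates and a policy for ORFs that extend past the host stop codon, neither of which is specified here.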