Institute of Cytology and Genetics, SB RAS, Novosibirsk, Russia.
Novosibirsk State Agrarian University, Novosibirsk, Russia.
Methods Mol Biol. 2025;2859:319-331. doi: 10.1007/978-1-0716-4152-1_18.
The ability of eukaryotic mRNAs to encode several functional polypeptides is widely discussed. Recent progress in NGS and proteomics techniques has produced a large volume of information on potential alternative translation initiation sites and alternative open reading frames (altORFs). However, these data are still incomplete, and the vast majority of eukaryotic mRNAs annotated in conventional databases (e.g., GenBank) contain a single ORF (CDS) encoding a protein larger than some arbitrary threshold (commonly 100 amino acid residues). Some gene functions may nevertheless relate to polypeptides encoded by unannotated altORFs, and insufficient information in nucleotide sequence databanks may limit the interpretation of genomics and transcriptomics data. Although accurate prediction of altORFs requires dedicated experiments, some simple methods are available for their preliminary mapping.
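To make the idea of preliminary altORF mapping concrete, the following is a minimal sketch of a naive forward-strand ORF scan. It is not the method described in the chapter. It assumes AUG-only initiation, the standard stop codons, and a hypothetical `min_codons` parameter set well below the conventional 100-residue threshold so that short ORFs are reported. The function name `find_orfs` is illustrative only.

```python
# Naive ORF scan: an illustration only, not the chapter's protocol.
# Assumptions: AUG-only starts (non-AUG initiation is ignored),
# standard stop codons, forward strand only.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(mrna, min_codons=10):
    """Return a sorted list of (start, end, frame, n_codons) tuples.

    start and end are 0-based; end is exclusive and includes the stop codon.
    n_codons counts the start codon but not the stop codon.
    For each stop codon, only the most upstream in-frame ATG is reported.
    """
    seq = mrna.upper().replace("U", "T")  # accept RNA or DNA letters
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame ATG opens an ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3, frame, n_codons))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


if __name__ == "__main__":
    # Toy sequence with two short ORFs in different frames.
    toy = "CC" + "ATG" + "GCT" * 3 + "TAA" + "G" + "ATG" + "AAA" * 2 + "TGA"
    for start, end, frame, n in find_orfs(toy, min_codons=1):
        print(f"frame {frame}: {start}-{end}, {n} codons")
```

Candidates found this way are only a starting point: a scan of this kind says nothing about whether a given ORF is actually translated, which is why experimental evidence is still needed.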