Faculty of Science and Engineering, Waseda University, 55N-06-10, 3-4-1 Okubo Shinjuku-ku, Tokyo, 169-8555, Japan.
AIST-Waseda University Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), 3-4-1, Okubo Shinjuku-ku, Tokyo, 169-8555, Japan.
BMC Genomics. 2018 Dec 31;19(Suppl 10):906. doi: 10.1186/s12864-018-5275-8.
With the increasing number of annotated long noncoding RNAs (lncRNAs) from the genome, researchers are continually updating their understanding of lncRNAs. Recently, thousands of lncRNAs have been reported to be associated with ribosomes in mammals. However, their biological functions or mechanisms are still unclear.
In this study, we investigated the sequence features involved in the ribosomal association of lncRNAs. We extracted ninety-nine sequence features corresponding to different biological mechanisms (i.e., RNA splicing, putative ORF, k-mer frequency, RNA modification, RNA secondary structure, and repeat element). An L1-regularized logistic regression model was applied to screen these features. This screening yielded fifteen and nine important features for the ribosomal association of human and mouse lncRNAs, respectively.
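Screening features with a regularized logistic regression works when the penalty drives the coefficients of uninformative features to exactly zero, which is the behavior of an L1 (lasso) penalty. The following is a minimal sketch of that idea, not the authors' implementation: it assumes an L1 penalty, fits the model by proximal gradient descent, and runs on synthetic data in which only the first two of six features carry signal. All function names, the penalty strength `lam`, the step size, and the toy data are illustrative assumptions.

```python
import math
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logistic(X, y, lam=0.1, step=0.5, n_iter=300):
    """Minimize mean log-loss + lam * ||w||_1 by proximal gradient descent.

    The intercept b is left unpenalized. lam, step and n_iter are
    illustrative defaults, not values taken from the study.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - label
            grad_b += err
            for j, xj in enumerate(row):
                grad_w[j] += err * xj
        b -= step * grad_b / n
        # Gradient step on the smooth loss, then soft-thresholding for the L1 term.
        w = [soft_threshold(wj - step * gj / n, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def make_toy_data(n=400, d=6, seed=0):
    """Synthetic data (assumption): only features 0 and 1 influence the label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in range(d)]
        logit = 2.0 * row[0] - 2.0 * row[1]
        y.append(1 if rng.random() < sigmoid(logit) else 0)
        X.append(row)
    return X, y


X, y = make_toy_data()
weights, bias = fit_l1_logistic(X, y)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
print("weights:", [round(wj, 3) for wj in weights])
print("selected feature indices:", selected)
```

In this sketch the two informative features should retain nonzero weights, while a sufficiently large `lam` typically shrinks the noise features to exactly zero. In practice, the penalty strength would be chosen by cross-validation rather than fixed by hand.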
To our knowledge, this is the first study to characterize ribosome-associated and ribosome-free lncRNAs from the perspective of sequence features. The sequence features identified in this study may shed light on the biological mechanism of ribosomal association and provide important clues for the functional analysis of lncRNAs.