Adam Mickiewicz University in Poznan, Institute of Anthropology, Laboratory of Integrative Genomics, Uniwersytetu Poznańskiego 6, 61-614 Poznan, Poland; Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine, Hannoversche Str. 28, 10115 Berlin, Germany.
Adam Mickiewicz University in Poznan, Institute of Anthropology, Laboratory of Integrative Genomics, Uniwersytetu Poznańskiego 6, 61-614 Poznan, Poland.
Biochim Biophys Acta Gene Regul Mech. 2020 Apr;1863(4):194385. doi: 10.1016/j.bbagrm.2019.05.003. Epub 2019 May 22.
A substantial fraction of the human transcriptome is composed of the so-called long noncoding RNAs (lncRNAs), yet the available catalogs of known lncRNAs are far from complete. Moreover, functional studies of these RNAs are challenged by several factors, such as their tissue-specific expression and functional heterogeneity, resulting in only ca. 1% of them being well characterized. Here, we describe a set of 41,400 novel lncRNAs discovered with RNA-Seq data from 1463 samples encompassing diverse tissues and cell lines. We utilized publicly available transcriptomic and genomic data to provide their characteristics, such as tissue specificity, cellular abundance, polyA status, cellular localization, evolutionary conservation and transcript stability, which allowed us to speculate on their possible biological roles. We also pinpointed 24 novel lncRNAs as candidates for breast cancer biomarkers. The results bring us closer to a comprehensive annotation of human lncRNAs, though vast amounts of further work are needed to validate the predictions and fully decipher their biology. This article is part of a Special Issue entitled: ncRNA in control of gene expression edited by Kotb Abdelmohsen.