Palma Alessandro, Buonaiuto Giulia, Ballarino Monica, Laneve Pietro
Department of Biology and Biotechnologies "Charles Darwin", Sapienza University of Rome, Piazzale Aldo Moro 5, Rome 00185, Italy.
Institute of Molecular Biology and Pathology, National Research Council of Italy, Piazzale Aldo Moro 7, Rome 00185, Italy.
Comput Struct Biotechnol J. 2025 Jan 31;27:575-584. doi: 10.1016/j.csbj.2025.01.026. eCollection 2025.
Long non-coding RNAs (lncRNAs) are a class of RNA molecules that exert regulatory functions with remarkable tissue and cellular specificity. Although the number of functionally significant lncRNAs identified keeps growing, a comprehensive profile of their genomic features is still lacking. Here, we present a detailed overview of the distribution of lncRNA genes across human chromosomes and describe key RNA features, which we refer to as a "virtual lncRNA karyotype", that provide insights into their biosynthesis and function. To achieve this, we leveraged existing human annotation files to construct a statistical genomic portrait of lncRNAs in comparison with protein-coding genes (PCGs). We found that lncRNAs are unevenly distributed across chromosomes and identified regions of high lncRNA density on chromosomes 18, 13, and X, which overlap with PCG-rich regions. Additionally, we observed that lncRNAs generally have shorter gene lengths and fewer splicing variants than protein-coding transcripts, with a subset displaying pronounced clustering patterns that may indicate functional relevance. Finally, we identified several clinically associated and experimentally validated SNPs impacting lncRNA genes (lncGs). Overall, this study provides a foundational reference for exploring the non-coding genome and offers new insights into the genomic characteristics of lncRNAs. These findings may enhance our understanding of their biological significance and potential roles in disease.
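The comparison of lncRNAs and PCGs from annotation files described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline. It assumes a GENCODE-style GTF in which each `gene` record carries a `gene_type` attribute, and it runs on a tiny made-up excerpt (`SAMPLE_GTF`, with hypothetical gene IDs and coordinates). The function name `summarize` is likewise an illustrative choice.

```python
import re
from collections import Counter, defaultdict
from statistics import median

# Tiny made-up GTF excerpt (hypothetical records, GENCODE-style attributes).
SAMPLE_GTF = """\
chr1\tDEMO\tgene\t1000\t5000\t.\t+\t.\tgene_id "G1"; gene_type "lncRNA";
chr1\tDEMO\tgene\t8000\t20000\t.\t-\t.\tgene_id "G2"; gene_type "protein_coding";
chr1\tDEMO\ttranscript\t8000\t20000\t.\t-\t.\tgene_id "G2"; gene_type "protein_coding";
chrX\tDEMO\tgene\t300\t1300\t.\t+\t.\tgene_id "G3"; gene_type "lncRNA";
chrX\tDEMO\tgene\t5000\t6500\t.\t+\t.\tgene_id "G4"; gene_type "lncRNA";
"""

# Assumption: the biotype is stored as gene_type "..." in the attribute column.
GENE_TYPE = re.compile(r'gene_type "([^"]+)"')


def summarize(gtf_text):
    """Count genes per chromosome and biotype, and collect gene lengths."""
    counts = defaultdict(Counter)   # chromosome -> biotype -> number of genes
    lengths = defaultdict(list)     # biotype -> list of gene lengths (bp)
    for line in gtf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # keep only gene-level records
        match = GENE_TYPE.search(fields[8])
        if match is None:
            continue
        biotype = match.group(1)
        start, end = int(fields[3]), int(fields[4])
        counts[fields[0]][biotype] += 1
        lengths[biotype].append(end - start + 1)  # GTF coordinates are 1-based, inclusive
    return counts, lengths


counts, lengths = summarize(SAMPLE_GTF)
median_length = {biotype: median(values) for biotype, values in lengths.items()}

for chrom in sorted(counts):
    print(chrom, dict(counts[chrom]))
print(median_length)
```

On a real annotation file, the same per-chromosome counts could be normalized by chromosome length to compare gene densities, which is the kind of quantity the abstract refers to when it reports regions of high lncRNA density.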