Dipartimento di Scienze e Tecnologie Biologiche Chimiche e Farmaceutiche (STeBiCeF), Viale delle Scienze, University of Palermo, Ed. 16, 90128, Palermo, PA, Italy.
Geroscience. 2019 Feb;41(1):39-49. doi: 10.1007/s11357-018-00050-2. Epub 2019 Jan 8.
Repetitive DNA sequences make up about half of the human genome. They have a central role in human biology, especially neurobiology, but are notoriously difficult to study. The purpose of this study was to quantify transcription from repetitive sequences in a progerin-expressing cellular model of neuronal aging. Progerin is a nuclear protein that causes Hutchinson-Gilford progeria syndrome and is also expressed at increasing levels during normal aging. A dedicated analysis pipeline made it possible to quantify transcripts containing repetitive sequences from RNAseq datasets regardless of their genomic location, while tolerating a degree of mutational noise and keeping computational requirements low. The pipeline was applied to a published panel of RNAseq datasets derived from a well-established and well-described cellular model of aging in dopaminergic neurons. Progerin expression strongly downregulated transcription from all classes of repetitive sequences: satellites, long and short interspersed nuclear elements, human endogenous retroviruses, and DNA transposons. The Alu element was by far the largest source of transcripts, whether originating from repetitive sequences or from canonical coding genes; it was expressed on average at 192,493.5 reads per kilobase per million mapped reads (RPKM) (SE = 21,081.3) in the control neurons and dropped to 43,760.1 RPKM (SE = 5,315.0) in the progerin-expressing neurons, a significant downregulation (p = 0.0005). The results highlight a global perturbation of transcripts derived from repetitive sequences in a cellular model of aging and provide a direct link between progerin expression and altered transcription from human repetitive elements.
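To put the reported Alu values in context, the following minimal sketch applies the standard RPKM definition and derives the size of the drop from the two means given in the abstract. It is an illustration only: the abstract does not describe how the pipeline computes RPKM internally, the function names `rpkm` and `fold_reduction` are hypothetical, and the read counts in the first example are invented for demonstration.

```python
def rpkm(read_count, feature_length_bp, total_mapped_reads):
    """Standard RPKM: reads per kilobase of feature per million mapped reads.

    Assumption: this is the textbook definition, not necessarily the exact
    normalization used by the pipeline described in the abstract.
    """
    if feature_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total mapped reads must be positive")
    return read_count * 1e9 / (feature_length_bp * total_mapped_reads)


def fold_reduction(control_value, treated_value):
    """How many times lower the treated value is than the control value."""
    return control_value / treated_value


# Hypothetical example: 300 reads on a 300 bp consensus in a library of
# 1,000,000 mapped reads.
example_rpkm = rpkm(300, 300, 1_000_000)

# Mean Alu expression as reported in the abstract (RPKM).
ALU_CONTROL_RPKM = 192_493.5
ALU_PROGERIN_RPKM = 43_760.1

fold = fold_reduction(ALU_CONTROL_RPKM, ALU_PROGERIN_RPKM)
percent_drop = 100 * (1 - ALU_PROGERIN_RPKM / ALU_CONTROL_RPKM)

print(f"example RPKM: {example_rpkm:.1f}")
print(f"Alu reduction: {fold:.2f}-fold ({percent_drop:.1f}% lower)")
```

Under these assumptions, the reported means correspond to roughly a 4.4-fold reduction in Alu-derived transcripts, or about 77% lower expression in the progerin-expressing neurons.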