Gudenas Brian L, Srivastava Anand K, Wang Liangjiang
Department of Genetics and Biochemistry, Clemson University, Clemson, South Carolina, United States of America.
J.C. Self Research Institute of Human Genetics, Greenwood Genetic Center, Greenwood, South Carolina, United States of America.
PLoS One. 2017 May 31;12(5):e0178532. doi: 10.1371/journal.pone.0178532. eCollection 2017.
Genetic studies have identified many risk loci for autism spectrum disorder (ASD), although the causal factors in the majority of cases are still unknown. All currently known ASD risk genes are protein-coding genes; however, the vast majority of human transcripts are non-coding RNAs (ncRNAs), which do not encode proteins. Recently, long non-coding RNAs (lncRNAs) were shown to be highly expressed in the human brain and crucial for normal brain development. We have constructed a computational pipeline that integrates various genomic datasets to identify lncRNAs associated with ASD. This pipeline uses differential gene expression patterns in affected tissues in conjunction with gene co-expression networks in tissue-matched non-affected samples. We analyzed RNA-seq data from cortical brain tissues of ASD cases and controls to identify lncRNAs differentially expressed in ASD. We derived a gene co-expression network from an independent human brain developmental transcriptome and detected a convergence of the differentially expressed lncRNAs and known ASD risk genes into specific co-expression modules. Co-expression network analysis facilitates the discovery of associations of previously uncharacterized lncRNAs with known ASD risk genes, affected molecular pathways and at-risk developmental time points. In addition, we show that some of these lncRNAs have a high degree of overlap with major copy number variants (CNVs) detected in ASD genetic studies. By using this integrative approach, which combines differential expression analysis in affected tissues with connectivity metrics from a developmental co-expression network, we have prioritized a set of candidate ASD-associated lncRNAs. The identification of lncRNAs as novel ASD susceptibility genes could help explain the genetic pathogenesis of ASD.
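The convergence step described in the abstract, in which differentially expressed lncRNAs and known ASD risk genes fall into the same co-expression modules, can be illustrated with a small sketch. This is not the authors' pipeline: the abstract does not name the statistical test used, so the one-sided hypergeometric enrichment test below is an assumption, and the function names (`hypergeom_sf`, `module_enrichment`, `prioritize`), the `alpha` threshold and the toy gene identifiers are all hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n genes from N, of which K belong to the gene set."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def module_enrichment(modules, gene_set, universe):
    """Return {module: (overlap, p-value)} for one gene set (assumed test)."""
    N = len(universe)
    K = len(gene_set & universe)
    results = {}
    for name, members in modules.items():
        members = members & universe
        k = len(members & gene_set)
        results[name] = (k, hypergeom_sf(k, N, K, len(members)))
    return results


def prioritize(modules, de_lncrnas, risk_genes, universe, alpha=0.05):
    """Keep modules enriched for both gene sets; list their DE lncRNAs."""
    de = module_enrichment(modules, de_lncrnas, universe)
    risk = module_enrichment(modules, risk_genes, universe)
    convergent = [m for m in modules if de[m][1] < alpha and risk[m][1] < alpha]
    return {m: sorted(modules[m] & de_lncrnas) for m in convergent}


# Toy data only: 30 hypothetical genes split into two modules.
universe = {f"g{i}" for i in range(30)}
modules = {
    "M1": {f"g{i}" for i in range(5)},
    "M2": {f"g{i}" for i in range(5, 30)},
}
de_lncrnas = {"g0", "g1"}        # hypothetical differentially expressed lncRNAs
risk_genes = {"g2", "g3", "g4"}  # hypothetical known risk genes

candidates = prioritize(modules, de_lncrnas, risk_genes, universe)
print(candidates)
```

In this toy case only module `M1` is enriched for both sets, so its two lncRNAs are returned as candidates. A real analysis would also need multiple-testing correction across modules and a connectivity-based ranking within each module, neither of which is shown here.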