A neural network based model effectively predicts enhancers from clinical ATAC-seq samples.

Affiliations

The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA.

Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, 06030, USA.

Publication information

Sci Rep. 2018 Oct 30;8(1):16048. doi: 10.1038/s41598-018-34420-9.

Abstract

Enhancers are cis-acting sequences that regulate transcription rates of their target genes in a cell-specific manner and harbor disease-associated sequence variants in cognate cell types. Many complex diseases are associated with enhancer malfunction, necessitating the discovery and study of enhancers from clinical samples. Assay for Transposase Accessible Chromatin (ATAC-seq) technology can interrogate chromatin accessibility from small cell numbers and facilitate studying enhancers in pathologies. However, on average, ~35% of open chromatin regions (OCRs) from ATAC-seq samples map to enhancers. We developed a neural network-based model, Predicting Enhancers from ATAC-Seq data (PEAS), to effectively infer enhancers from clinical ATAC-seq samples by extracting ATAC-seq data features and integrating these with sequence-related features (e.g., GC ratio). PEAS recapitulated ChromHMM-defined enhancers in CD14+ monocytes, CD4+ T cells, GM12878, peripheral blood mononuclear cells, and pancreatic islets. PEAS models trained on these 5 cell types effectively predicted enhancers in four cell types that are not used in model training (EndoC-βH1, naïve CD8+ T, MCF7, and K562 cells). Finally, PEAS inferred individual-specific enhancers from 19 islet ATAC-seq samples and revealed variability in enhancer activity across individuals, including those driven by genetic differences. PEAS is an easy-to-use tool developed to study enhancers in pathologies by taking advantage of the increasing number of clinical epigenomes.
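Two quantities mentioned in the abstract can be illustrated with a minimal sketch: the GC ratio of a region's sequence (one of the sequence-related features) and the fraction of OCRs that overlap annotated enhancers (the ~35% figure). This is not the PEAS implementation. The function names, the tuple-based interval representation, and the half-open coordinate convention are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical interval type: (chromosome, start, end), half-open [start, end).
Interval = Tuple[str, int, int]


def gc_ratio(seq: str) -> float:
    """Fraction of G/C bases among unambiguous A/C/G/T bases in a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    # Sequences with no unambiguous bases (e.g. all N) get a ratio of 0.0.
    return gc / total if total else 0.0


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def enhancer_fraction(ocrs: List[Interval], enhancers: List[Interval]) -> float:
    """Fraction of OCRs overlapping at least one annotated enhancer."""
    if not ocrs:
        return 0.0
    # Simple quadratic scan; fine for a sketch, too slow for genome-scale data.
    hits = sum(any(overlaps(o, e) for e in enhancers) for o in ocrs)
    return hits / len(ocrs)
```

In practice the enhancer intervals would come from an annotation such as the ChromHMM states named in the abstract, and an interval-tree or sorted-sweep approach would replace the quadratic scan for real peak sets.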

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f75b/6207744/184378aad2f1/41598_2018_34420_Fig1_HTML.jpg
