Interpretation of allele-specific chromatin accessibility using cell state-aware deep learning.

Affiliations

VIB-KU Leuven Center for Brain and Disease Research, 3000 Leuven, Belgium.

Department of Human Genetics, KU Leuven, 3000 Leuven, Belgium.

Publication information

Genome Res. 2021 Jun;31(6):1082-1096. doi: 10.1101/gr.260851.120. Epub 2021 Apr 8.

Abstract

Genomic sequence variation within enhancers and promoters can have a significant impact on the cellular state and phenotype. However, sifting through the millions of candidate variants in a personal genome or a cancer genome, to identify those that impact cis-regulatory function, remains a major challenge. Interpretation of noncoding genome variation benefits from explainable artificial intelligence to predict and interpret the impact of a mutation on gene regulation. Here we generate phased whole genomes with matched chromatin accessibility, histone modifications, and gene expression for 10 melanoma cell lines. We find that training a specialized deep learning model, called DeepMEL2, on melanoma chromatin accessibility data can capture the various regulatory programs of the melanocytic and mesenchymal-like melanoma cell states. This model outperforms motif-based variant scoring, as well as more generic deep learning models. We detect hundreds to thousands of allele-specific chromatin accessibility variants (ASCAVs) in each melanoma genome, of which 15%-20% can be explained by gains or losses of transcription factor binding sites. A considerable fraction of ASCAVs are caused by changes in AP-1 binding, as confirmed by matched ChIP-seq data to identify allele-specific binding of JUN and FOSL1. Finally, by augmenting the DeepMEL2 model with ChIP-seq data for GABPA, the TERT promoter mutation, as well as additional ETS motif gains, can be identified with high confidence. In conclusion, we present a new integrative genomics approach and a deep learning model to identify and interpret functional enhancer mutations with allelic imbalance of chromatin accessibility and gene expression.
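
To make the notion of allelic imbalance concrete, here is a minimal Python sketch of how a heterozygous site might be tested for allele-specific accessibility. The abstract does not say which statistical test the authors used, so the exact two-sided binomial test against a 50/50 null is an assumption, and the function names `allelic_ratio` and `binomial_two_sided_p` are hypothetical. The inputs are the numbers of accessibility reads that carry the reference and the alternative allele at one site.

```python
from math import comb


def allelic_ratio(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads carrying the reference allele (0.5 means balanced)."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return ref_reads / total


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic imbalance.

    Assumption: under the null, each read comes from the reference allele
    with probability p. The p-value sums the probabilities of all outcomes
    that are no more likely than the observed one.
    Intended for moderate read depths; very deep sites would need a
    log-space or library implementation.
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0

    def pmf(k: int) -> float:
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    observed = pmf(ref_reads)
    # Small relative tolerance so ties are not lost to floating-point rounding.
    total = sum(pmf(k) for k in range(n + 1) if pmf(k) <= observed * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical read counts at one heterozygous site.
    ref, alt = 30, 10
    print(f"ratio={allelic_ratio(ref, alt):.2f}, p={binomial_two_sided_p(ref, alt):.4g}")
```

In a genome-wide scan, the resulting p-values would also need multiple-testing correction, since thousands of heterozygous sites are tested in each genome.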

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7cb3/8168584/9f06df4540de/1082f01.jpg
