Integration of ChIP-seq and machine learning reveals enhancers and a predictive regulatory sequence vocabulary in melanocytes.

Affiliation

McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA.

Publication information

Genome Res. 2012 Nov;22(11):2290-301. doi: 10.1101/gr.139360.112. Epub 2012 Sep 27.

Abstract

We take a comprehensive approach to the study of regulatory control of gene expression in melanocytes that proceeds from large-scale enhancer discovery facilitated by ChIP-seq; to rigorous validation in silico, in vitro, and in vivo; and finally to the use of machine learning to elucidate a regulatory vocabulary with genome-wide predictive power. We identify 2489 putative melanocyte enhancer loci in the mouse genome by ChIP-seq for EP300 and H3K4me1. We demonstrate that these putative enhancers are evolutionarily constrained, enriched for sequence motifs predicted to bind key melanocyte transcription factors, located near genes relevant to melanocyte biology, and capable of driving reporter gene expression in melanocytes in culture (86%; 43/50) and in transgenic zebrafish (70%; 7/10). Next, using the sequences of these putative enhancers as a training set for a supervised machine learning algorithm, we develop a vocabulary of 6-mers predictive of melanocyte enhancer function. Lastly, we demonstrate that this vocabulary has genome-wide predictive power in both the mouse and human genomes. This study provides deep insight into the regulation of gene expression in melanocytes and demonstrates a powerful approach to the investigation of regulatory sequences that can be applied to other cell types.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f887/3483558/bf971fc3b975/2290fig1.jpg
