Benner Philipp, Vingron Martin
Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Ihnestraße 73, 14195 Berlin, Germany.
NAR Genom Bioinform. 2021 Oct 27;3(4):lqab095. doi: 10.1093/nargab/lqab095. eCollection 2021 Dec.
Recent efforts to measure epigenetic marks across a wide variety of cell types and tissues provide insights into the cell type-specific regulatory landscape. We use these data to study whether epigenetic signals have a correlate in the DNA sequence of enhancers, and we explore computationally to what degree such sequence patterns can be used to predict cell type-specific regulatory activity. By constructing classifiers that predict the tissues in which enhancers are active, we identify sequence features that the cell might recognize in order to regulate gene expression. Although classification performance varies greatly between tissues, we show examples where our classifiers correctly predict tissue-specific regulation from sequence alone. We also show that many of the informative patterns indeed harbor transcription factor footprints.