Park Hyun-Seok
Bioinformatics Laboratory, ELTEC College of Engineering, Ewha Womans University, Seoul 03760, Korea.
Center for Convergence Research of Advanced Technologies, Ewha Womans University, Seoul 03760, Korea.
Genomics Inform. 2018 Sep;16(3):65-70. doi: 10.5808/GI.2018.16.3.65. Epub 2018 Sep 30.
The non-coding DNA in eukaryotic genomes encodes a language that programs chromatin accessibility, transcription factor binding, and various other activities. The objective of this short report was to determine the impact of primary DNA sequence on the epigenomic landscape across 200-base pair genomic units by integrating nine publicly available ChromHMM Browser Extensible Data (BED) files from the Encyclopedia of DNA Elements (ENCODE) project. Nucleotide frequency profiles of nine chromatin annotations were analyzed in 200-bp units, and integrative Markov chains were built to detect the Markov properties of the DNA sequences in some of the active chromatin states of different ChromHMM regions. Our aim was to identify possible relationships between DNA sequences and the newly built chromatin states based on the integrated ChromHMM datasets of different cell and tissue types.
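The two computations named in the abstract, a nucleotide frequency profile and a Markov chain over DNA sequence, can be illustrated with a minimal Python sketch. It assumes a first-order chain over the alphabet A, C, G, T and takes the sequences of the 200-bp units as plain strings; the function names, the pseudocount parameter, and the toy input are hypothetical and are not taken from the report, which does not specify the chain order or implementation here.

```python
from collections import Counter

ALPHABET = "ACGT"


def nucleotide_frequencies(seqs):
    """Relative frequency of each base across all sequences (non-ACGT symbols are skipped)."""
    counts = Counter()
    for s in seqs:
        counts.update(b for b in s.upper() if b in ALPHABET)
    total = sum(counts.values())
    return {b: (counts[b] / total if total else 0.0) for b in ALPHABET}


def transition_matrix(seqs, pseudocount=0.0):
    """First-order transition probabilities P(next base | current base).

    Assumption: a first-order chain; transitions are counted within each
    sequence only, never across sequence boundaries.
    """
    counts = {a: {b: pseudocount for b in ALPHABET} for a in ALPHABET}
    for s in seqs:
        s = s.upper()
        for a, b in zip(s, s[1:]):
            if a in ALPHABET and b in ALPHABET:
                counts[a][b] += 1
    matrix = {}
    for a in ALPHABET:
        row_total = sum(counts[a].values())
        # A base that is never followed by another base gets an all-zero row.
        matrix[a] = {
            b: (counts[a][b] / row_total if row_total else 0.0) for b in ALPHABET
        }
    return matrix


if __name__ == "__main__":
    # Toy input standing in for the 200-bp units of one chromatin state.
    units = ["ACGT", "AACC"]
    print(nucleotide_frequencies(units))
    print(transition_matrix(units))
```

In practice the sequences for each chromatin state would be extracted from a reference genome at the coordinates listed in the corresponding BED file, and the resulting matrices compared between states. A small positive `pseudocount` avoids zero probabilities when a state has few units.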