GenomicLayers: sequence-based simulation of epi-genomes.
Author information
Gerrard Dave T
Affiliation
Division of Evolution, Infection and Genomics, School of Biological Sciences, Faculty of Biology, Medicine & Health, The University of Manchester, Stopford Building, Oxford Road, Manchester, M13 9PT, UK.
Publication information
BMC Bioinformatics. 2025 Aug 4;26(1):205. doi: 10.1186/s12859-025-06224-y.
BACKGROUND
Cellular development and differentiation in Eukaryotes depend upon sequential gene regulatory decisions that allow a single genome to encode many hundreds of distinct cellular phenotypes. These decisions are stored in the regulatory state of each cell, an important part of which is the epi-genome: the collection of proteins, RNA and their specific associations with the genome. Further cellular responses are, in part, determined by this regulatory state. To date, models of regulatory state have failed to include the contingency of incoming regulatory signals on the current epi-genetic state, and none have done so at the whole-genome level.
RESULTS
Here we introduce GenomicLayers, a new R package to run rules-based simulations of epigenetic state changes genome-wide in Eukaryotes. Simulations model the accumulation of changes to genome-wide layers by user-specified binding factors. As a first exemplar, we show two versions of a simple model of the recruitment and spreading of epigenetic marks near telomeres in the yeast Saccharomyces cerevisiae. By combining the output from 100 runs of the simulation, we generate whole genome predictions of epigenetic state at 1 bp resolution. The example yeast models are included within a 'vignette' with the GenomicLayers package, which is available at https://github.com/davetgerrard/GenomicLayers. To demonstrate the use of GenomicLayers on the full human reference genome (hg38), we show the results from parameter refinement on a simplistic model of the action of pluripotency factors against a self-spreading repressor seeded at CpG islands. The human genome model is included in supplementary information as an R script.
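The general idea of a rules-based layer simulation can be illustrated with a small, self-contained Python sketch. This is not the GenomicLayers API (the package itself is written in R, and its functions are not reproduced here); every name, rule and parameter below (`find_motif`, `run_once`, `simulate`, the "CG" seed motif, the spreading width) is a hypothetical stand-in chosen for illustration. The sketch assumes one binary layer over a toy sequence, a seeding factor that binds a sequence motif, and a spreading factor whose binding depends on the current state of the layer. Per-base frequencies are then averaged over 100 runs, loosely mirroring the combination of runs described above.

```python
import random

def find_motif(seq, motif):
    """Return start positions of every (possibly overlapping) motif match."""
    hits, i = [], seq.find(motif)
    while i != -1:
        hits.append(i)
        i = seq.find(motif, i + 1)
    return hits

def run_once(seq, n_iter, rng, seed_motif="CG", spread=3):
    """One stochastic run; returns a 0/1 list with one entry per base."""
    mark = [0] * len(seq)
    seeds = find_motif(seq, seed_motif)
    for _ in range(n_iter):
        # Rule 1 (seeding, hypothetical): bind one motif match and mark it.
        if seeds:
            s = rng.choice(seeds)
            for p in range(s, s + len(seed_motif)):
                mark[p] = 1
        # Rule 2 (spreading, hypothetical): bind an already-marked base and
        # mark its neighbours, so the outcome depends on the current state.
        marked = [p for p, m in enumerate(mark) if m]
        if marked:
            c = rng.choice(marked)
            for p in range(max(0, c - spread), min(len(seq), c + spread + 1)):
                mark[p] = 1
    return mark

def simulate(seq, n_runs=100, n_iter=20, seed=1):
    """Average independent runs into a per-base marking frequency."""
    rng = random.Random(seed)
    totals = [0] * len(seq)
    for _ in range(n_runs):
        for p, m in enumerate(run_once(seq, n_iter, rng)):
            totals[p] += m
    return [t / n_runs for t in totals]

if __name__ == "__main__":
    toy = "AT" * 4 + "CG" + "AT" * 10  # toy sequence, not real genomic data
    print(" ".join(f"{f:.2f}" for f in simulate(toy)))
```

In this toy setting the seed motif itself is marked in every run, while the frequency at other positions depends on how far the spreading rule reaches within the chosen number of iterations. A real model would track several layers and factors at once and would need efficient range-based data structures to scale to whole genomes.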
CONCLUSIONS
GenomicLayers enables scientists working on diverse eukaryotic organisms to test models of gene regulation in silico. Applications include epigenetic silencing, activation by combinatorial binding of transcription factors and the sink effects caused by down-regulation of components of epigenetic regulators. The software is intended to be used to parameterise, refine and combine models and thereby capitalise on data from the thousands of studies of Eukaryotic epigenomes.