Kimura Yasumasa, Ono Yoshimasa, Katayama Kotoe, Imoto Seiya
DX Drug Discovery Department, Daiichi Sankyo RD Novare Co., Ltd., Edogawa-ku, Tokyo 134-8630, Japan.
Division of Health Medical Intelligence, Human Genome Center, Institute of Medical Science, The University of Tokyo, Minato-ku, Tokyo 108-8639, Japan.
Bioinform Adv. 2024 Aug 20;4(1):vbae118. doi: 10.1093/bioadv/vbae118. eCollection 2024.
Enhancers play critical roles in cell-type-specific transcriptional control. Despite the identification of thousands of candidate enhancers, unravelling their regulatory relationships with their target genes remains challenging. Therefore, computational approaches are needed to accurately infer enhancer-gene regulatory relationships.
In this study, we propose a new method, IVEA, that predicts enhancer-gene regulatory interactions by estimating promoter and enhancer activities. Its statistical model is based on the gene regulatory mechanism of transcriptional bursting, in which burst size is controlled by promoters and burst frequency by enhancers. Taking transcriptional readouts, chromatin accessibility, and chromatin contact data as inputs, IVEA estimates promoter and enhancer activities by variational Bayesian inference and calculates the contribution of each enhancer-promoter pair to target gene transcription. Our analysis demonstrates that the proposed method can achieve high prediction accuracy and provide biologically relevant enhancer-gene regulatory interactions.
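To make the bursting idea concrete, the following toy sketch treats mean expression as burst size times burst frequency and splits the frequency among enhancers. It is not the IVEA model and involves no variational inference. The additive form of the frequency (enhancer activity times contact, summed over enhancers), the function names, and the input values are all assumptions made for illustration only.

```python
# Toy illustration only; NOT the IVEA implementation.
# Assumption: burst size equals the promoter activity, and burst frequency
# is the sum over enhancers of (enhancer activity x contact with promoter).

def enhancer_terms(enhancer_activity, contact):
    """Per-enhancer term of the assumed burst frequency."""
    return [a * c for a, c in zip(enhancer_activity, contact)]

def mean_expression(promoter_activity, enhancer_activity, contact):
    """Mean expression under the toy model: burst size x burst frequency."""
    burst_size = promoter_activity
    burst_frequency = sum(enhancer_terms(enhancer_activity, contact))
    return burst_size * burst_frequency

def enhancer_contributions(enhancer_activity, contact):
    """Fraction of the assumed burst frequency attributed to each enhancer."""
    terms = enhancer_terms(enhancer_activity, contact)
    total = sum(terms)
    if total == 0:
        # No enhancer contacts the promoter, so nothing can be attributed.
        return [0.0 for _ in terms]
    return [t / total for t in terms]

# Hypothetical inputs: three candidate enhancers for one gene.
activity = [2.0, 1.0, 1.0]   # e.g. derived from chromatin accessibility
contact = [0.5, 1.0, 0.0]    # e.g. derived from chromatin contact data
fractions = enhancer_contributions(activity, contact)
expression = mean_expression(3.0, activity, contact)
```

In this sketch the third enhancer receives no share because it has no contact with the promoter, while the first two split the burst frequency according to the product of their activity and contact.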
The IVEA code is available on GitHub at https://github.com/yasumasak/ivea. The publicly available datasets used in this study are described in Supplementary Table S4.