HEAP: a task adaptive-based explainable deep learning framework for enhancer activity prediction.

Authors

Liu Yuhang, Wang Zixuan, Yuan Hao, Zhu Guiquan, Zhang Yongqing

Affiliations

School of Computer Science, Chengdu University of Information Technology, 610225, Chengdu, China.

College of Electronics and Information Engineering, Sichuan University, 610065, Chengdu, China.

Publication information

Brief Bioinform. 2023 Sep 20;24(5). doi: 10.1093/bib/bbad286.

Abstract

Enhancers are crucial cis-regulatory elements that control gene expression in a cell-type-specific manner. Despite extensive genetic and computational studies, accurately predicting enhancer activity across cell types remains a challenge, and the grammar of enhancers is still poorly understood. Here, we present HEAP (high-resolution enhancer activity prediction), an explainable deep learning framework for predicting enhancers and exploring enhancer grammar. The framework includes three modules that use grammar-based reasoning for enhancer prediction, and it can incorporate both DNA sequences and epigenetic modifications to improve accuracy. To predict enhancers efficiently in different cell types, we use a novel two-step multi-task learning method, task adaptive parameter sharing (TAPS): we first train a shared model on all cell-type datasets, then adapt it to each specific task by adding several task-specific subset layers. Experiments demonstrate that HEAP outperforms published methods and show the effectiveness of TAPS, especially for cell types with limited training samples. Notably, HEAP uses post-hoc interpretation to provide insight into the prediction mechanisms from three perspectives (data, model architecture and algorithm), leading to a better understanding of model decisions and enhancer grammar. We expect HEAP to be a valuable tool for gaining insight into the complex mechanisms of enhancer activity.
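
The two-step idea described in the abstract (train a shared model on all cell types, then adapt it per task) can be illustrated with a deliberately small sketch. This is not the authors' TAPS implementation: the abstract does not specify the architecture, so the sketch swaps in a logistic regression with frozen shared weights and a per-task additive offset in place of the "task-specific subset layers". The task names, feature layout, labelling rules and hyperparameters are all hypothetical toy choices.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def logit(shared_w, delta, x):
    # Effective weight = shared weight + task-specific offset (assumed form).
    return sum((w + d) * xi for w, d, xi in zip(shared_w, delta, x))

def mean_loss(data, shared_w, delta):
    # Mean binary cross-entropy, written in an overflow-safe form.
    total = 0.0
    for x, y in data:
        z = logit(shared_w, delta, x)
        total += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - y * z
    return total / len(data)

def fit(data, shared_w, delta, train_shared, steps=200, lr=0.5):
    # Full-batch gradient descent on copies of the inputs.
    # train_shared=True updates the shared weights (step 1);
    # train_shared=False freezes them and updates only the offsets (step 2).
    shared_w, delta = list(shared_w), list(delta)
    for _ in range(steps):
        grad = [0.0] * len(shared_w)
        for x, y in data:
            err = sigmoid(logit(shared_w, delta, x)) - y
            for i, xi in enumerate(x):
                grad[i] += err * xi / len(data)
        target = shared_w if train_shared else delta
        for i, g in enumerate(grad):
            target[i] -= lr * g
    return shared_w, delta

def make_task(rule, n, seed):
    # Toy stand-in for a cell-type dataset: bias term plus two features in [-1, 1].
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = [1.0, rng.uniform(-1, 1), rng.uniform(-1, 1)]
        data.append((x, 1.0 if rule(x) else 0.0))
    return data

# Hypothetical tasks: one data-rich cell type and one with few samples.
tasks = {
    "cell_type_A": make_task(lambda x: x[1] + x[2] > 0, 200, seed=0),
    "cell_type_B": make_task(lambda x: x[1] - x[2] > 0.3, 20, seed=1),
}

zeros = [0.0, 0.0, 0.0]

# Step 1: train one shared model on the pooled data from all tasks.
pooled = [example for data in tasks.values() for example in data]
shared_w, _ = fit(pooled, zeros, zeros, train_shared=True)

# Step 2: keep the shared weights fixed and learn a small offset per task.
adapted = {}
for name, data in tasks.items():
    frozen, delta = fit(data, shared_w, zeros, train_shared=False)
    adapted[name] = delta
    before = mean_loss(data, shared_w, zeros)
    after = mean_loss(data, frozen, delta)
    print(f"{name}: shared-only loss {before:.3f}, adapted loss {after:.3f}")
```

Freezing the shared weights in the second step means each task adds only a handful of parameters of its own, which is one plausible reason such a scheme could help tasks with few training samples; the sketch reports training loss only and makes no claim about generalization.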
