TF-EPI: an interpretable enhancer-promoter interaction detection method based on Transformer.

Authors

Liu Bowen, Zhang Weihang, Zeng Xin, Loza Martin, Park Sung-Joon, Nakai Kenta

Affiliations

Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan.

Human Genome Center, Institute of Medical Science, University of Tokyo, Tokyo, Japan.

Publication information

Front Genet. 2024 Aug 9;15:1444459. doi: 10.3389/fgene.2024.1444459. eCollection 2024.

Abstract

The detection of enhancer-promoter interactions (EPIs) is crucial for understanding gene expression regulation, disease mechanisms, and more. In this study, we developed TF-EPI, a deep learning model based on Transformer designed to detect these interactions solely from DNA sequences. The performance of TF-EPI surpassed that of other state-of-the-art methods on multiple benchmark datasets. Importantly, by utilizing the attention mechanism of the Transformer, we identified distinct cell type-specific motifs and sequences in enhancers and promoters, which were validated against databases such as JASPAR and UniBind, highlighting the potential of our method in discovering new biological insights. Moreover, our analysis of the transcription factors (TFs) corresponding to these motifs and short sequence pairs revealed the heterogeneity and commonality of gene regulatory mechanisms and demonstrated the ability to identify TFs relevant to the source information of the cell line. Finally, the introduction of transfer learning can mitigate the challenges posed by cell type-specific gene regulation, yielding enhanced accuracy in cross-cell line EPI detection. Overall, our work unveils important sequence information for the investigation of enhancer-promoter pairs based on the attention mechanism of the Transformer, providing an important milestone in the investigation of cis-regulatory grammar.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/683b/11341371/ac40cd7b3351/fgene-15-1444459-g001.jpg
