EpiGePT: a pretrained transformer-based language model for context-specific human epigenomics.

Author information

Gao Zijing, Liu Qiao, Zeng Wanwen, Jiang Rui, Wong Wing Hung

Affiliations

Ministry of Education Key Laboratory of Bioinformatics, Bioinformatics Division at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, 100084, China.

Department of Statistics, Stanford University, Stanford, CA 94305, USA.

Publication information

Genome Biol. 2024 Dec 18;25(1):310. doi: 10.1186/s13059-024-03449-7.

Abstract

The inherent similarities between natural language and biological sequences have inspired the use of large language models in genomics, but current models struggle to incorporate chromatin interactions or predict in unseen cellular contexts. To address this, we propose EpiGePT, a transformer-based model designed for predicting context-specific human epigenomic signals. By incorporating transcription factor activities and 3D genome interactions, EpiGePT outperforms existing methods in epigenomic signal prediction tasks, especially in cell-type-specific long-range interaction predictions and genetic variant impacts, advancing our understanding of gene regulation. A free online prediction service is available at http://health.tsinghua.edu.cn/epigept .
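To make the modeling idea concrete, the sketch below is a minimal, hypothetical PyTorch illustration of the scheme the abstract describes: a DNA segment is embedded into per-bin tokens, a cell-type-specific transcription-factor (TF) activity vector conditions those tokens, and a transformer encoder predicts several epigenomic signal tracks per bin, with self-attention standing in for long-range (3D genome) dependencies. All class names, dimensions, TF counts, and layer choices are assumptions for illustration only; they are not the authors' actual EpiGePT implementation or its online API.

```python
# Minimal, hypothetical sketch of the idea in the abstract: DNA sequence bins,
# conditioned on a cell-type-specific TF activity vector, are fed to a
# transformer encoder that predicts per-bin epigenomic signal tracks.
# All names, sizes, and layers below are illustrative assumptions, not the
# authors' actual EpiGePT architecture.
import torch
import torch.nn as nn


class ContextConditionedEpigenomeModel(nn.Module):
    def __init__(self, n_tfs=512, n_bins=1000, d_model=256, n_tracks=8):
        super().__init__()
        # Convolutional embedding of one-hot DNA (4 channels) into per-bin tokens.
        self.seq_embed = nn.Sequential(
            nn.Conv1d(4, d_model, kernel_size=15, padding=7),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(n_bins),  # pool base pairs down to n_bins tokens
        )
        # Cell-type context: project TF activities into the token space so the
        # same DNA sequence yields different predictions in different cell types.
        self.tf_embed = nn.Linear(n_tfs, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=8, batch_first=True
        )
        # Self-attention lets distal bins exchange information, standing in for
        # long-range (3D genome) dependencies.
        self.transformer = nn.TransformerEncoder(encoder_layer, num_layers=4)
        self.head = nn.Linear(d_model, n_tracks)  # per-bin epigenomic signal tracks

    def forward(self, one_hot_seq, tf_activity):
        # one_hot_seq: (batch, 4, seq_len); tf_activity: (batch, n_tfs)
        tokens = self.seq_embed(one_hot_seq).transpose(1, 2)  # (batch, n_bins, d_model)
        context = self.tf_embed(tf_activity).unsqueeze(1)     # (batch, 1, d_model)
        hidden = self.transformer(tokens + context)           # condition every bin
        return self.head(hidden)                              # (batch, n_bins, n_tracks)


if __name__ == "__main__":
    model = ContextConditionedEpigenomeModel()
    seq = torch.randn(2, 4, 16_000)  # stand-in for one-hot encoded DNA segments
    tfs = torch.randn(2, 512)        # stand-in for TF activity scores (placeholder count)
    print(model(seq, tfs).shape)     # torch.Size([2, 1000, 8])
```

In this reading, context specificity comes entirely from the TF-activity vector: swapping in another cell type's TF activities changes the predicted tracks for the same DNA sequence, and a variant's impact could be scored by differencing predictions for the reference and alternative alleles.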


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/82db/11657395/441076a392ef/13059_2024_3449_Fig1_HTML.jpg
