Predicting Differentially Methylated Cytosines in TET and DNMT3 Knockout Mutants via a Large Language Model.

Authors

Sereshki Saleh, Lonardi Stefano

Affiliation

Department of Computer Science and Engineering, University of California, Riverside, 900 University Ave, Riverside, 92521, CA, United States.

Publication

bioRxiv. 2024 Sep 4:2024.05.02.592257. doi: 10.1101/2024.05.02.592257.

Abstract

DNA cytosine methylation is an epigenetic marker that regulates many cellular processes. Mammalian genomes typically maintain consistent methylation patterns over time, except in specific regulatory regions such as promoters and certain types of enhancers. The dynamics of DNA methylation are controlled by a complex cellular machinery in which the enzymes DNMT3 and TET play a major role. This study explores the identification of differentially methylated cytosines (DMCs) in TET and DNMT3 knockout mutants in mouse and human embryonic stem cells. We investigate (i) whether a large language model can be trained to recognize DMCs in human and mouse from the sequence surrounding the cytosine of interest, (ii) whether a classifier trained on human knockout data can predict DMCs in the mouse genome (and vice versa), and (iii) whether a classifier trained on DNMT3 knockout data can predict DMCs for TET knockout (and vice versa). Our study identifies statistically significant motifs associated with the prediction of DMCs in each mutant, shedding new light on DNA methylation dynamics in stem cells. Our software tool is available at https://github.com/ucrbioinfo/dmc_prediction.
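As a rough illustration of question (i), the input to a sequence-based classifier could be prepared as in the sketch below: a fixed-length window centered on the cytosine of interest, split into overlapping k-mers. The abstract does not specify the window size, the tokenization, or the model, so the function names (`cytosine_window`, `kmer_tokens`), the `flank` and `k` parameters, and the k-mer scheme are all assumptions made for illustration and are not taken from the authors' tool.

```python
from typing import List

# Complement table for the plus/minus strand conversion (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def cytosine_window(chrom_seq: str, pos: int, flank: int, strand: str = "+") -> str:
    """Return a window of length 2*flank + 1 centered on a cytosine.

    `pos` is a 0-based coordinate on the plus strand. Windows that run off
    either end of the sequence are padded with 'N'. For a minus-strand
    cytosine the window is reverse-complemented so that the center is
    always 'C'. (Hypothetical helper; parameters are assumptions.)
    """
    seq = chrom_seq.upper()
    left = max(0, pos - flank)
    right = min(len(seq), pos + flank + 1)
    left_pad = "N" * (flank - (pos - left))
    right_pad = "N" * (flank - (right - pos - 1))
    window = left_pad + seq[left:right] + right_pad
    if strand == "-":
        window = reverse_complement(window)
    if window[flank] != "C":
        raise ValueError(f"position {pos} on strand {strand} is not a cytosine")
    return window


def kmer_tokens(window: str, k: int = 6) -> List[str]:
    """Split a window into overlapping k-mers (stride 1)."""
    return [window[i:i + k] for i in range(len(window) - k + 1)]


if __name__ == "__main__":
    toy_chrom = "AAGTCGATTG"  # toy sequence, not real data
    w = cytosine_window(toy_chrom, pos=4, flank=3)
    print(w, kmer_tokens(w, k=3))
```

The resulting token list, paired with a DMC / non-DMC label for the central cytosine, would be one training example for whichever sequence model is used.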
