deepTFBS: Improving within- and Cross-Species Prediction of Transcription Factor Binding Using Deep Multi-Task and Transfer Learning.

Author information

Zhai Jingjing, Zhang Yuzhou, Zhang Chujun, Yin Xiaotong, Song Minggui, Tang Chenglong, Ding Pengjun, Li Zenglin, Ma Chuang

Affiliations

State Key Laboratory for Crop Stress Resistance and High-Efficiency Production, Center of Bioinformatics, College of Life Sciences, Northwest A&F University, Yangling, Shaanxi, 712100, China.

Key Laboratory of Biology and Genetics Improvement of Maize in Arid Area of Northwest Region, Ministry of Agriculture, Northwest A&F University, Yangling, Shaanxi, 712100, China.

Publication information

Adv Sci (Weinh). 2025 Aug;12(30):e03135. doi: 10.1002/advs.202503135. Epub 2025 May 24.

Abstract

Precise prediction of transcription factor binding sites (TFBSs) is crucial for understanding gene regulation. This study presents deepTFBS, a comprehensive deep learning (DL) framework that builds a robust DNA language model of TF binding grammar to accurately predict TFBSs within and across plant species. Taking advantage of multi-task DL and transfer learning, deepTFBS leverages knowledge learned from large-scale TF binding profiles to improve TFBS prediction in small-sample training and cross-species prediction tasks. When tested on available data for 359 Arabidopsis TFs, deepTFBS outperformed previously described prediction strategies, including position weight matrix, deepSEA, and DanQ, improving the area under the precision-recall curve (PRAUC) by 244.49%, 49.15%, and 23.32%, respectively. Further cross-species prediction of TFBSs in wheat showed that deepTFBS yielded a significant PRAUC improvement of 30.6% over these three baseline models. deepTFBS can also use information from gene conservation and binding motifs, enabling efficient TFBS prediction in species with limited experimental data. A case study focusing on the WUSCHEL (WUS) transcription factor illustrated the potential of deepTFBS in cross-species applications, here between Arabidopsis and wheat. deepTFBS is publicly available at https://github.com/cma2015/deepTFBS.
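
To make the reported PRAUC gains easier to interpret, the following minimal sketch shows one common way such numbers can be computed. It is not the authors' evaluation code: it assumes PRAUC is estimated as step-wise average precision and that the percentages are relative gains over a baseline. The function names `average_precision` and `relative_improvement` and all labels and scores are hypothetical and invented for illustration.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise area under the precision-recall curve (average precision).

    labels: 1 for a bound site, 0 for an unbound site.
    scores: predicted binding scores; higher means more likely bound.
    Tied scores are not handled specially in this sketch.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank examples from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


def relative_improvement(new: float, baseline: float) -> float:
    """Percentage gain of `new` over `baseline` (assumed definition)."""
    return (new - baseline) / baseline * 100.0


# Toy data, invented purely for illustration (not from the paper).
labels = [1, 0, 1, 0, 0, 1]
baseline_scores = [0.4, 0.9, 0.3, 0.8, 0.2, 0.7]
model_scores = [0.9, 0.1, 0.8, 0.3, 0.2, 0.4]

baseline_ap = average_precision(labels, baseline_scores)
model_ap = average_precision(labels, model_scores)
gain = relative_improvement(model_ap, baseline_ap)
print(f"baseline={baseline_ap:.3f} model={model_ap:.3f} gain={gain:.1f}%")
```

Under this assumed definition, a 244.49% improvement would mean a PRAUC roughly 3.44 times that of the baseline, which is why gains above 100% are possible.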

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f846/12376555/846b0685b0cc/ADVS-12-e03135-g002.jpg
