PTFSpot: deep co-learning on transcription factors and their binding regions attains impeccable universality in plants.

Affiliations

Studio of Computational Biology & Bioinformatics, The Himalayan Centre for High-throughput Computational Biology, (HiCHiCoB, A BIC supported by DBT, India), Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology (CSIR-IHBT), Palampur, Himachal Pradesh 176061, India.

Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, Uttar Pradesh 201002, India.

Publication information

Brief Bioinform. 2024 May 23;25(4). doi: 10.1093/bib/bbae324.

Abstract

In plants, unlike in animals, the variability of transcription factors (TFs) and their binding regions (TFBRs) across species is a major problem that most existing TFBR-finding software fails to tackle, rendering it of hardly any use. This limitation has resulted in the underdevelopment of plant regulatory research and the rampant use of Arabidopsis-like model species, generating misleading results. Here, we report a revolutionary transformer-based deep-learning approach, PTFSpot, which learns from the co-variability of TF structures and their binding regions to build a universal TF-DNA interaction model that detects TFBRs free from the limitations of TF-specific and species-specific models. In a series of extensive benchmarking studies over multiple experimentally validated datasets, it not only outperformed the existing software by a margin of more than 30% but also consistently delivered >90% accuracy, even for species and TF families that were never encountered during model building. PTFSpot now makes it possible to accurately annotate TFBRs across any plant genome, even in the total absence of TF information, free from the bottlenecks of species-specific and TF-specific models.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fcf5/11250369/a328a2b7916b/bbae324f1.jpg
