Perspectives on Codebook: sequence specificity of uncharacterized human transcription factors.

Authors

Jolma Arttu, Laverty Kaitlin U, Fathi Ali, Yang Ally W H, Yellan Isaac, Vorontsov Ilya E, Inukai Sachi, Kribelbauer-Swietek Judith F, Gralak Antoni J, Razavi Rozita, Albu Mihai, Brechalov Alexander, Patel Zain M, Nozdrin Vladimir, Meshcheryakov Georgy, Kozin Ivan, Abramov Sergey, Boytsov Alexandr, Fornes Oriol, Makeev Vsevolod J, Grau Jan, Grosse Ivo, Bucher Philipp, Deplancke Bart, Kulakovskiy Ivan V, Hughes Timothy R

Affiliations

Donnelly Centre, University of Toronto, Toronto, ON M5S 3E1, Canada.

Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA.

Publication information

bioRxiv. 2024 Nov 12:2024.11.11.622097. doi: 10.1101/2024.11.11.622097.

Abstract

We describe an effort ("Codebook") to determine the sequence specificity of 332 putative and largely uncharacterized human transcription factors (TFs), as well as 61 control TFs. Nearly 5,000 independent experiments across multiple in vitro and in vivo assays produced motifs for just over half of the putative TFs analyzed (177, or 53%), of which most are unique to a single TF. The data highlight the extensive contribution of transposable elements to TF evolution, both in cis and in trans, and identify tens of thousands of conserved, base-level binding sites in the human genome. The use of multiple assays provides an unprecedented opportunity to benchmark and analyze TF sequence specificity, function, and evolution, as further explored in accompanying manuscripts. 1,421 human TFs are now associated with a DNA binding motif. Extrapolation from the Codebook benchmarking, however, suggests that many of the currently known binding motifs for well-studied TFs may inaccurately describe the TF's true sequence preferences.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/01e8/11601247/267da3c70f65/nihpp-2024.11.11.622097v1-f0001.jpg
