Probing instructions for expression regulation in gene nucleotide compositions.

Author information

Bessière Chloé, Taha May, Petitprez Florent, Vandel Jimmy, Marin Jean-Michel, Bréhélin Laurent, Lèbre Sophie, Lecellier Charles-Henri

Affiliations

IBC, Univ. Montpellier, CNRS, Montpellier, France.

Institut de Génétique Moléculaire de Montpellier, University of Montpellier, CNRS, Montpellier, France.

Publication information

PLoS Comput Biol. 2018 Jan 2;14(1):e1005921. doi: 10.1371/journal.pcbi.1005921. eCollection 2018 Jan.

Abstract

Gene expression is orchestrated by distinct regulatory regions to ensure a wide variety of cell types and functions. A challenge is to identify which regulatory regions are active, what their associated features are, and how they work together in each cell type. Several approaches have tackled this problem by modeling gene expression based on epigenetic marks, with the ultimate goal of identifying driving regions and associated genomic variations that are clinically relevant, particularly in precision medicine. However, these models rely on experimental data, which are limited to specific samples (often only cell lines) and cannot be generated for all regulators and all patients. In addition, we show here that, although these approaches are accurate in predicting gene expression, inferring transcription factor (TF) combinations from this type of model is not straightforward. Furthermore, these methods are not designed to capture regulation instructions present at the sequence level, before the binding of regulators or the opening of the chromatin. Here, we probe sequence-level instructions for gene expression and develop a method to explain mRNA levels based solely on nucleotide features. Our method positions nucleotide composition as a critical component of gene expression. Moreover, our approach, which can rank regulatory regions according to their contribution, unveils a strong influence of the gene body sequence, in particular introns. We further provide evidence that the contribution of nucleotide content can be linked to co-regulations associated with genome 3D architecture and to associations of genes within topologically associated domains.
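
The abstract does not state which nucleotide features or which model the authors use. As an illustrative sketch only, one plausible reading of "nucleotide features" is the mono- and dinucleotide frequencies of each regulatory region of a gene. The function names and the region labels (`promoter`, `intron`) below are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product

NUCLEOTIDES = "ACGT"


def composition_features(sequence):
    """Return mono- and dinucleotide frequencies for one region.

    Assumption: ambiguous bases (e.g. N) are ignored, and dinucleotides
    are counted over overlapping windows on one strand only.
    """
    seq = sequence.upper()
    features = {}

    # Mononucleotide frequencies over unambiguous bases.
    mono = Counter(base for base in seq if base in NUCLEOTIDES)
    total = sum(mono.values())
    for base in NUCLEOTIDES:
        features[base] = mono[base] / total if total else 0.0

    # Dinucleotide frequencies over overlapping windows of length 2.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    valid = [p for p in pairs if p[0] in NUCLEOTIDES and p[1] in NUCLEOTIDES]
    di = Counter(valid)
    n_pairs = len(valid)
    for first, second in product(NUCLEOTIDES, repeat=2):
        key = first + second
        features[key] = di[key] / n_pairs if n_pairs else 0.0

    return features


def gene_feature_vector(regions):
    """Combine per-region features into one dict keyed '<region>:<feature>'.

    `regions` maps a hypothetical region label to its DNA sequence.
    """
    return {
        f"{name}:{feature}": value
        for name, seq in regions.items()
        for feature, value in composition_features(seq).items()
    }


if __name__ == "__main__":
    example = {"promoter": "ACGTACGT", "intron": "AAAATTTT"}
    for key, value in sorted(gene_feature_vector(example).items()):
        print(key, round(value, 3))
```

Vectors of this kind could serve as predictors in a regression against measured mRNA levels, with the fitted coefficients grouped by region to compare contributions. That modeling step is not shown here because the abstract gives no details about it.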

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6dc8/5766238/73c610a78347/pcbi.1005921.g001.jpg
