Predicting CTCF-mediated chromatin loops using CTCF-MP.

Affiliation

Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA.

Publication information

Bioinformatics. 2018 Jul 1;34(13):i133-i141. doi: 10.1093/bioinformatics/bty248.

Abstract

MOTIVATION

The three-dimensional organization of chromosomes within the cell nucleus is highly regulated. CCCTC-binding factor (CTCF) is known to be an important architectural protein that mediates long-range chromatin loops. Recent studies have shown that the majority of CTCF binding motif pairs at chromatin loop anchor regions are in convergent orientation. However, it remains unknown whether the genomic context at the sequence level can determine whether a convergent CTCF motif pair is able to form a chromatin loop.
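To make "convergent orientation" concrete, here is a minimal sketch that classifies a pair of motif occurrences by strand. It assumes the usual convention that a pair is convergent when the upstream motif lies on the forward strand and the downstream motif on the reverse strand, so that the two point toward each other. The `Motif` record and `pair_orientation` function are hypothetical names introduced for illustration and are not taken from the CTCF-MP code.

```python
from typing import NamedTuple


class Motif(NamedTuple):
    """Hypothetical record for one CTCF motif occurrence."""
    chrom: str
    start: int   # 0-based start coordinate of the motif
    strand: str  # "+" (forward) or "-" (reverse)


def pair_orientation(a: Motif, b: Motif) -> str:
    """Classify two motifs on the same chromosome by relative orientation."""
    if a.chrom != b.chrom:
        raise ValueError("motifs must lie on the same chromosome")
    for m in (a, b):
        if m.strand not in ("+", "-"):
            raise ValueError(f"unknown strand: {m.strand!r}")
    # Order the pair by genomic position so input order does not matter.
    upstream, downstream = sorted((a, b), key=lambda m: m.start)
    labels = {
        "+-": "convergent",  # motifs point toward each other
        "-+": "divergent",   # motifs point away from each other
        "++": "tandem",      # both on the forward strand
        "--": "tandem",      # both on the reverse strand
    }
    return labels[upstream.strand + downstream.strand]
```

Only pairs labeled `"convergent"` by such a filter would be candidates for the loop-prediction question posed here.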

RESULTS

In this article, we directly ask whether sequence-based features (other than the motif itself) may be important for establishing CTCF-mediated chromatin loops and, if so, which ones. We found that motif conservation measured by 'branch-of-origin', which accounts for motif turnover in evolution, is an important feature. We developed a new machine learning algorithm called CTCF-MP, based on word2vec, to demonstrate that sequence-based features alone can predict whether a pair of convergent CTCF motifs will form a loop. Together with functional genomic signals from CTCF ChIP-seq and DNase-seq, CTCF-MP makes highly accurate predictions of whether a convergent CTCF motif pair will form a loop, both within a single cell type and across different cell types. Our work represents an important step toward understanding the sequence determinants that may guide the formation of complex chromatin architectures.
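word2vec learns embeddings from sequences of tokens, so applying it to DNA requires first turning a sequence into "words". One common approach, sketched below, is to split the sequence into overlapping k-mers. The abstract does not state how CTCF-MP tokenizes sequences, so the function name `kmer_tokens`, the default `k=6`, and the `stride` parameter are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List


def kmer_tokens(seq: str, k: int = 6, stride: int = 1) -> List[str]:
    """Split a DNA sequence into k-mer 'words' (illustrative sketch).

    k and stride are assumed values, not parameters taken from CTCF-MP.
    """
    if k <= 0 or stride <= 0:
        raise ValueError("k and stride must be positive")
    seq = seq.upper()
    # Slide a window of width k along the sequence in steps of `stride`.
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


if __name__ == "__main__":
    # Each token list plays the role of a 'sentence' for an embedding model.
    print(kmer_tokens("ccactagGTGGC", k=6, stride=1))
```

Token lists of this kind, one per sequence around each motif, could then be passed to any word2vec implementation to obtain k-mer embeddings that serve as sequence-based features.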

AVAILABILITY AND IMPLEMENTATION

The source code of CTCF-MP can be accessed at: https://github.com/ma-compbio/CTCF-MP.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
