StereoGene: rapid estimation of genome-wide correlation of continuous or interval feature data.

Author information

Department of Bioengineering and Bioinformatics, Moscow State University, Moscow 119992, Russia.

Institute for Information Transmission Problems, RAS, Moscow 127994, Russia.

Publication information

Bioinformatics. 2017 Oct 15;33(20):3158-3165. doi: 10.1093/bioinformatics/btx379.

Abstract

MOTIVATION

Genomic features with similar genome-wide distributions are generally hypothesized to be functionally related; for example, colocalization of histones and transcription start sites indicates chromatin regulation of transcription factor activity. Therefore, statistical algorithms that compute spatial, genome-wide correlation among genomic features are required.

RESULTS

Here, we propose a method, StereoGene, that rapidly estimates genome-wide correlation among pairs of genomic features. These features may represent high-throughput data mapped to a reference genome or sets of genomic annotations in that reference genome. StereoGene correlates continuous data directly, avoiding data binarization and the resulting loss of information. Correlations are computed among neighboring genomic positions using kernel correlation. Representing the correlation as a function of genome position, StereoGene outputs a local correlation track as part of the analysis. StereoGene also accounts for confounders, such as input DNA, by partial correlation. We apply our method to numerous comparisons of ChIP-Seq datasets from the Human Epigenome Atlas and FANTOM CAGE to demonstrate its wide applicability. We observe changes in the correlation between epigenomic features across developmental trajectories of several tissue types that are consistent with known biology, and we find a novel spatial correlation of CAGE clusters with donor splice sites and with poly(A) sites. These analyses provide examples of the broad applicability of StereoGene to regulatory genomics.
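
The two ideas named above, kernel correlation and partial correlation, can be illustrated with a small Python sketch. This is not StereoGene's implementation (which is written in C++); it assumes a Gaussian kernel, smooths both tracks before taking a Pearson correlation, and uses the standard first-order partial correlation formula. All function names and the toy tracks are hypothetical.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalised Gaussian kernel truncated at 4 sigma (an assumed kernel shape)."""
    radius = int(4 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def pearson(a, b):
    """Plain position-by-position Pearson correlation of two tracks."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

def kernel_correlation(f, g, sigma):
    """Correlate two tracks after smoothing, so nearby signals also count."""
    k = gaussian_kernel(sigma)
    return pearson(np.convolve(f, k, mode="same"),
                   np.convolve(g, k, mode="same"))

def partial_correlation(f, g, c, corr=pearson):
    """Correlation of f and g with the linear effect of confounder c removed."""
    r_fg, r_fc, r_gc = corr(f, g), corr(f, c), corr(g, c)
    return float((r_fg - r_fc * r_gc) / np.sqrt((1 - r_fc**2) * (1 - r_gc**2)))

# Toy tracks: the same peaks, offset by 5 positions (hypothetical data).
f = np.zeros(2000)
f[np.arange(100, 2000, 137)] = 1.0
g = np.roll(f, 5)

plain = pearson(f, g)                     # peaks never overlap exactly
smoothed = kernel_correlation(f, g, 20)   # neighbouring peaks are detected

# Toy confounder: two tracks that are related only through a shared input signal.
rng = np.random.default_rng(0)
control = rng.normal(size=5000)
track_a = control + rng.normal(size=5000)
track_b = control + rng.normal(size=5000)
raw = pearson(track_a, track_b)
adjusted = partial_correlation(track_a, track_b, control)
```

In this toy setting the plain correlation of the offset peaks is close to zero while the kernel correlation is high, and the partial correlation removes most of the association induced by the shared control signal.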

AVAILABILITY AND IMPLEMENTATION

The StereoGene C++ source code, program documentation, Galaxy integration scripts and examples are available from the project homepage http://stereogene.bioinf.fbb.msu.ru/.

CONTACT

favorov@sensi.org.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
