Large-scale imputation of epigenomic datasets for systematic annotation of diverse human tissues.

Authors

Jason Ernst, Manolis Kellis

Affiliations

[1] Department of Biological Chemistry, University of California, Los Angeles, California, USA. [2] Computer Science Department, University of California, Los Angeles, California, USA. [3] Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research at UCLA, Los Angeles, California, USA. [4] Jonsson Comprehensive Cancer Center, University of California, Los Angeles, California, USA. [5] Molecular Biology Institute, University of California, Los Angeles, California, USA.

[1] MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, Massachusetts, USA. [2] Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA.

Publication information

Nat Biotechnol. 2015 Apr;33(4):364-76. doi: 10.1038/nbt.3157. Epub 2015 Feb 18.

DOI: 10.1038/nbt.3157

PMID: 25690853

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4512306/

Abstract

With hundreds of epigenomic maps, the opportunity arises to exploit the correlated nature of epigenetic signals, across both marks and samples, for large-scale prediction of additional datasets. Here, we undertake epigenome imputation by leveraging such correlations through an ensemble of regression trees. We impute 4,315 high-resolution signal maps, of which 26% are also experimentally observed. Imputed signal tracks show overall similarity to observed signals and surpass experimental datasets in consistency, recovery of gene annotations and enrichment for disease-associated variants. We use the imputed data to detect low-quality experimental datasets, to find genomic sites with unexpected epigenomic signals, to define high-priority marks for new experiments and to delineate chromatin states in 127 reference epigenomes spanning diverse tissues and cell types. Our imputed datasets provide the most comprehensive human regulatory region annotation to date, and our approach and the ChromImpute software constitute a useful complement to large-scale experimental mapping of epigenomic information.
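
To make the core idea concrete, the following is a minimal sketch of imputing one signal track with a bagged ensemble of small regression trees. It is not the ChromImpute implementation: the data are synthetic, and every name (`fit_tree`, `fit_ensemble`, `other_mark_same_sample`, `same_mark_other_sample`) is hypothetical. It assumes only what the abstract states, namely that the target signal correlates with other marks and with other samples, and it uses two made-up feature tracks to stand in for those two sources of correlation.

```python
import random

def avg(values):
    values = list(values)
    return sum(values) / len(values)

def fit_tree(X, y, depth=0, max_depth=3, min_leaf=5):
    """Fit a small regression tree; returns a leaf value or (feature, threshold, left, right)."""
    if depth >= max_depth or len(y) < 2 * min_leaf:
        return avg(y)
    best = None  # (squared error, feature index, threshold)
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X))[:-1]:
            left = [y[i] for i, row in enumerate(X) if row[j] <= t]
            right = [y[i] for i, row in enumerate(X) if row[j] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            ml, mr = avg(left), avg(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t)
    if best is None:
        return avg(y)
    _, j, t = best
    li = [i for i, row in enumerate(X) if row[j] <= t]
    ri = [i for i, row in enumerate(X) if row[j] > t]
    return (j, t,
            fit_tree([X[i] for i in li], [y[i] for i in li], depth + 1, max_depth, min_leaf),
            fit_tree([X[i] for i in ri], [y[i] for i in ri], depth + 1, max_depth, min_leaf))

def predict_tree(tree, row):
    # Walk down until a leaf (a plain float) is reached.
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

def fit_ensemble(X, y, n_trees=10, seed=0):
    """Bagging: each tree is trained on a bootstrap resample of the positions."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(y)) for _ in y]
        trees.append(fit_tree([X[i] for i in idx], [y[i] for i in idx]))
    return trees

def predict_ensemble(trees, row):
    return avg(predict_tree(t, row) for t in trees)

# Synthetic genomic positions (assumption: purely illustrative data).
rng = random.Random(1)
n = 300
other_mark_same_sample = [rng.random() for _ in range(n)]
same_mark_other_sample = [rng.random() for _ in range(n)]
X = [[a, b] for a, b in zip(other_mark_same_sample, same_mark_other_sample)]
target = [2.0 * a + 1.0 * b + rng.gauss(0, 0.05) for a, b in X]

# Train on the first 200 positions, impute the held-out 100.
trees = fit_ensemble(X[:200], target[:200])
pred = [predict_ensemble(trees, row) for row in X[200:]]

train_mean = avg(target[:200])
baseline_mse = avg((v - train_mean) ** 2 for v in target[200:])
model_mse = avg((p - v) ** 2 for p, v in zip(pred, target[200:]))
print(f"baseline MSE {baseline_mse:.3f}, ensemble MSE {model_mse:.3f}")
```

On this toy data the ensemble should beat the constant-mean baseline, which is the minimal sense in which correlated tracks allow a missing track to be predicted. Averaging several trees trained on resampled positions reduces the variance of any single tree, which is the usual motivation for an ensemble.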
