Li Hongyang, Guan Yuanfang
Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI 48109, USA.
Nat Mach Intell. 2022 Mar;4(3):288-299. doi: 10.1038/s42256-022-00455-x. Epub 2022 Mar 21.
Decoding the epigenomic landscapes in diverse tissues and cell types is fundamental to understanding molecular mechanisms underlying many essential cellular processes and human diseases. Recent advances in artificial intelligence provide new methods and strategies for imputing unknown epigenomes based on existing data, yet how to reveal the predictive relationships among epigenetic marks remains largely unexplored. Here we present a machine learning approach for epigenomic imputation and interpretation. Through dissection of the spatial contributions from six histone marks, we reveal the prevalent and asymmetric cross-prediction relationships among these marks. Meanwhile, our approach achieved high predictive performance on held-out prospective epigenomes and outperformed the state-of-the-art. To facilitate future research, we further applied this approach to impute a total of 527 and 2,455 unavailable genome-wide histone modification signal tracks for the ENCODE3 and Roadmap datasets, respectively.