Weiqiang Zhou, Hongkai Ji
Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, 615 North Wolfe Street, Baltimore, MD 21205, USA.
Wiley Interdiscip Rev Comput Stat. 2021 Sep-Oct;13(5). doi: 10.1002/wics.1544. Epub 2020 Dec 17.
Decoding gene regulation in a biological system requires information from both the transcriptome and the regulome. Although multiple high-throughput technologies exist for mapping each, transcriptome profiling is more widely used. Today, over a million bulk and single-cell gene expression samples are stored in public databases, orders of magnitude more than the number of available regulome samples, and most of these expression samples have no corresponding regulome data. However, regulome information can be obtained via prediction. Open chromatin is a hallmark of active regulatory elements, which makes chromatin accessibility a natural target for such prediction. This mini-review discusses recent advances in predicting chromatin accessibility from gene expression data, covering both the development of prediction methods and their applications in expanding the regulome catalog, improving regulome analysis, integrating transcriptome and regulome data, and facilitating single-cell analysis of gene regulation.