Xu Tianlei, Zheng Xiaoqi, Li Ben, Jin Peng, Qin Zhaohui, Wu Hao
Department of Mathematics and Computer Science, Emory University, Atlanta, GA, USA.
Department of Mathematics, Shanghai Normal University, Shanghai, China.
Brief Bioinform. 2020 Jan 17;21(1):120-134. doi: 10.1093/bib/bby110.
There are significant correlations among different types of genetic, genomic and epigenomic features within the genome. These correlations make in silico feature prediction possible through statistical or machine learning models. With the accumulation of vast amounts of high-throughput data, feature prediction has attracted considerable interest, and many papers have been published in the past few years. Here we provide a comprehensive review of these published works, categorized by prediction target: protein binding sites, enhancers, DNA methylation, chromatin structure and gene expression. We also discuss some important points and possible future directions.