Xu Yungang, Wang Yongcui, Luo Jiesi, Zhao Weiling, Zhou Xiaobo
Center for Systems Medicine, School of Biomedical Informatics, University of Texas Health Science Center at Houston, TX 77030, USA.
Center for Bioinformatics and Systems Biology, Wake Forest School of Medicine, Winston-Salem, NC 27157, USA.
Nucleic Acids Res. 2017 Dec 1;45(21):12100-12112. doi: 10.1093/nar/gkx870.
Alternative splicing (AS) is a genetically and epigenetically regulated pre-mRNA processing step that increases transcriptome and proteome diversity. Comprehensively decoding these regulatory mechanisms promises deeper insight into the many biological contexts that involve AS, such as development and disease. We assembled a splicing (epi)genetic code, DeepCode, for human embryonic stem cell (hESC) differentiation by integrating heterogeneous features of genomic sequences and 16 histone modifications with a multi-label deep neural network. By exploiting epigenetic features, DeepCode significantly improves the prediction of splicing patterns and their changes during hESC differentiation. DeepCode also reveals the superiority of epigenomic features and their dominant role in decoding AS patterns, highlighting the need to include epigenetic properties when assembling a more comprehensive splicing code. Moreover, DeepCode makes robust predictions across cell lineages and datasets. In particular, we identified a putative H3K36me3-regulated AS event that leads to nonsense-mediated mRNA decay of BARD1. Reduced BARD1 expression attenuates ATM/ATR signalling activity and, in turn, influences hESC differentiation. These results suggest a novel candidate mechanism linking histone modifications to hESC fate decisions. In addition, when trained in different contexts, DeepCode can be extended to a variety of biological and biomedical fields.