Jiang Limin, Duan Mingrui, Guo Fei, Tang Jijun, Oybamiji Olufunmilola, Yu Hui, Ness Scott, Zhao Ying-Yong, Mao Peng, Guo Yan
Comprehensive Cancer Center, Department of Internal Medicine, University of New Mexico, Albuquerque, NM 87109, USA.
School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin 300350, China.
NAR Cancer. 2020 Dec;2(4):zcaa030. doi: 10.1093/narcan/zcaa030. Epub 2020 Oct 13.
Binding motifs for transcription factors, RNA-binding proteins, microRNAs (miRNAs) and other regulators are vital for proper regulation of gene transcription and translation. Sequence alteration mechanisms, including single-nucleotide mutations, insertions, deletions, RNA editing and single-nucleotide polymorphisms, can lead to gains and losses of binding motifs; we term these newly emerged or vanished binding motifs 'somatic motifs'. Somatic motifs have been studied sporadically but have never been curated into a comprehensive resource. By analyzing various types of sequence alteration data from large consortia, we identified millions of somatic motifs, including those for important transcription factors, RNA-binding proteins, miRNA seeds and miRNA-mRNA 3'-UTR target motifs. While a few of these somatic motifs have been well studied, our results contain many novel somatic motifs that occur at high frequency and are thus likely to have important biological consequences. Genes targeted by these altered motifs are excellent candidates for further mechanistic studies. Here, we present the first database that hosts millions of somatic motifs ascribed to a variety of sequence alteration mechanisms.
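The notion of a motif being gained or lost through a sequence alteration can be illustrated with a toy sketch. This is not the pipeline used to build the database: it assumes exact string matching against a single motif and handles only a single-nucleotide substitution (insertions and deletions shift coordinates and are not covered). The function names, the example sequences and the motif are all hypothetical.

```python
from typing import Dict, List, Set


def apply_substitution(seq: str, pos: int, alt: str) -> str:
    """Return seq with the base at 0-based position pos replaced by alt."""
    if len(alt) != 1 or not 0 <= pos < len(seq):
        raise ValueError("expected a single base at a valid position")
    return seq[:pos] + alt + seq[pos + 1:]


def motif_starts(seq: str, motif: str) -> Set[int]:
    """0-based start positions of exact motif matches (overlaps allowed)."""
    k = len(motif)
    return {i for i in range(len(seq) - k + 1) if seq[i:i + k] == motif}


def classify_motif_changes(ref: str, pos: int, alt: str, motif: str) -> Dict[str, List[int]]:
    """Compare motif sites before and after a substitution.

    A substitution keeps coordinates aligned, so sites present only in the
    altered sequence count as gained and sites present only in the
    reference count as lost.
    """
    altered = apply_substitution(ref, pos, alt)
    before = motif_starts(ref, motif)
    after = motif_starts(altered, motif)
    return {"gained": sorted(after - before), "lost": sorted(before - after)}


# Hypothetical 7-mer motif and toy sequences (assumptions for illustration).
MOTIF = "GCACTTT"
loss_example = classify_motif_changes("AAGCACTTTAGG", pos=5, alt="G", motif=MOTIF)
gain_example = classify_motif_changes("AAGCAGTTTAGG", pos=5, alt="C", motif=MOTIF)
print(loss_example)
print(gain_example)
```

In the first call the substitution disrupts the only motif site in the reference, so it is reported as lost; in the second call the reverse substitution creates that site, so it is reported as gained. A realistic analysis would typically score sites against position weight matrices rather than require exact matches.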