Wang Dandan, Li Daixi, Qin Guangrong, Zhang Wen, Ouyang Jian, Zhang Menghuan, Xie Lu
Institute of Food Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China.
Shanghai Center for Bioinformation Technology, Shanghai Academy of Science and Technology, Shanghai 201203, China.
Comput Math Methods Med. 2015;2015:912742. doi: 10.1155/2015/912742. Epub 2015 Aug 10.
Chromosomal translocation, which generates fusion proteins in blood tumors and solid tumors, is considered one of the major causes of cancer. Recent studies suggest that the disordered fragments in a fusion protein might contribute to its carcinogenicity. Here, we investigated the sequence features near the breakpoints in the fusion partner genes, the structural features of the breakpoints in fusion proteins, and the posttranslational modification preferences in the fusion proteins. The results show that the breakpoints in the fusion partner genes have both a sequence preference and a structural preference. At the sequence level, the dinucleotide AG is preferred before the breakpoint and GG is preferred at the breakpoint. At the structural level, the breakpoints in the fusion proteins tend to be located in disordered regions. Further analysis suggests that phosphorylation sites at serine and threonine and methylation sites at arginine are enriched in the disordered regions of the fusion proteins. Using EML4-ALK as an example, we further explained how the fusion leads to protein disorder and contributes to carcinogenicity. The sequence and structural features of the fusion proteins may help the scientific community to predict novel breakpoints in fusion genes and to better understand the structure and function of fusion proteins.
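The sequence-level observation (AG preferred before the breakpoint, GG at the breakpoint) can be illustrated with a simple dinucleotide tally. The sketch below is not the authors' pipeline, which the abstract does not describe. It assumes each breakpoint is given as a 0-based index into a nucleotide string, that "before the breakpoint" means the two nucleotides immediately upstream of that index, and that "at the breakpoint" means the two nucleotides starting at it. The function names and the toy sequences are hypothetical and were made up for illustration only.

```python
from collections import Counter


def dinucleotide_at(seq, start):
    """Return the dinucleotide seq[start:start + 2], or None if it falls outside seq."""
    if start < 0 or start + 2 > len(seq):
        return None
    return seq[start:start + 2].upper()


def breakpoint_dinucleotides(records):
    """Tally dinucleotides immediately before and at each breakpoint.

    records: iterable of (sequence, breakpoint_index) pairs.
    breakpoint_index is a 0-based position in the sequence (an assumption
    of this sketch, not a convention taken from the paper).
    """
    before, at = Counter(), Counter()
    for seq, bp in records:
        upstream = dinucleotide_at(seq, bp - 2)  # two bases just before the breakpoint
        spanning = dinucleotide_at(seq, bp)      # two bases starting at the breakpoint
        if upstream is not None:
            before[upstream] += 1
        if spanning is not None:
            at[spanning] += 1
    return before, at


# Toy, made-up sequences; real input would be genomic sequence around
# annotated fusion breakpoints.
toy_records = [
    ("ACTAGGGTC", 5),
    ("TTAGGGAA", 4),
    ("CCATGCTA", 4),
]

before_counts, at_counts = breakpoint_dinucleotides(toy_records)
print("before breakpoint:", before_counts.most_common())
print("at breakpoint:", at_counts.most_common())
```

A real analysis would also compare these counts against a background dinucleotide frequency before calling any combination "preferred"; that step is omitted here.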