Institute of Mathematical Problems of Biology RAS - the Branch of Keldysh Institute of Applied Mathematics of Russian Academy of Sciences, Pushchino, Moscow Region, Russia.
Phystech School of Applied Mathematics and Informatics, Moscow Institute of Physics and Technology (National Research University), Dolgoprudny, Moscow Region, Russia.
PLoS One. 2020 May 29;15(5):e0233978. doi: 10.1371/journal.pone.0233978. eCollection 2020.
Intronic regions of genes are mostly considered in the context of gene expression regulation, for example alternative splicing. However, the relationships between basic statistical properties of introns are rarely studied in detail, despite the vast amount of available data. In particular, little is known about the relationship between intron length and intron phase. The intron phase distribution differs significantly across intron length thresholds. In this study, we performed GO enrichment analysis of gene sets with a particular intron phase at varying intron length thresholds, using a list of 13,823 orthologous human-mouse gene pairs. We found a specific group of 153 genes with phase 1 introns longer than 50 kilobases that were specifically expressed in the brain, functionally related to synaptic signaling, and strongly associated with schizophrenia and other mental disorders. We propose that the prevalence of long phase 1 introns arises from the presence of the signal peptide sequence and is connected with 1-1 exon shuffling.
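To make the two quantities in the abstract concrete, the following is a minimal sketch of how intron phase can be derived and how a phase distribution can be tabulated above a length threshold. It assumes the standard definition of intron phase (the number of coding nucleotides upstream of the intron, modulo 3) and is not the authors' pipeline. The function names, the input layout, and the toy genes are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def intron_phases(cds_exon_lengths):
    """Phase of each intron, given coding lengths of consecutive exons.

    Assumes phase = (coding nucleotides upstream of the intron) mod 3:
    0 means the intron lies between codons, 1 after the first codon
    position, 2 after the second.
    """
    phases = []
    coding_so_far = 0
    # The last exon is not followed by an intron, so it is skipped.
    for length in cds_exon_lengths[:-1]:
        coding_so_far += length
        phases.append(coding_so_far % 3)
    return phases


def phase_distribution(genes, min_intron_length):
    """Fraction of introns in each phase among introns longer than a threshold.

    `genes` is a hypothetical layout: a list of
    (cds_exon_lengths, intron_lengths) pairs, one pair per gene.
    """
    counts = Counter()
    for cds_exon_lengths, intron_lengths in genes:
        phases = intron_phases(cds_exon_lengths)
        for phase, intron_length in zip(phases, intron_lengths):
            if intron_length > min_intron_length:
                counts[phase] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {phase: counts[phase] / total for phase in (0, 1, 2)}


# Toy data (made up, not taken from the study).
toy_genes = [
    ([100, 50, 30], [60000, 1200]),
    ([90, 61, 29], [800, 75000]),
]

all_introns = phase_distribution(toy_genes, 0)
long_introns = phase_distribution(toy_genes, 50000)
```

Comparing `all_introns` with `long_introns` mirrors the kind of comparison described in the abstract: the same phase tabulation repeated at different intron length thresholds.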