Lopes Inês, Altab Gulam, Raina Priyanka, de Magalhães João Pedro
Integrative Genomics of Ageing Group, Institute of Ageing and Chronic Disease, University of Liverpool, Liverpool, United Kingdom.
Front Genet. 2021 Feb 11;12:559998. doi: 10.3389/fgene.2021.559998. eCollection 2021.
Although gene length is expected to be associated with factors such as intron number and evolutionary conservation, the connections between gene length and function in the human genome are not yet understood. In this study, we show that, as expected, there is a strong positive correlation between gene length, transcript length, and protein size, as well as a correlation with the number of genetic variants and introns. Among tissue-specific genes, we find that the longest transcripts tend to be expressed in the blood vessels, nerves, thyroid, cervix uteri, and brain, while the shortest transcripts tend to be expressed in the pancreas, skin, stomach, vagina, and testis. Consistent with previous reports, we find that natural selection suppresses changes in genes with longer transcripts and promotes changes in genes with shorter transcripts. We also observe that genes with longer transcripts tend to have a higher number of co-expressed genes and protein-protein interactions, as well as more associated publications. In the functional analysis, we show that longer transcripts are often associated with neuronal development, while shorter transcripts tend to play roles in skin development and in the immune system. Furthermore, pathways related to cancer, neurons, and heart diseases tend to contain genes with longer transcripts, whereas genes with shorter transcripts are found in pathways related to immune responses and neurodegenerative diseases. Based on our results, we hypothesize that longer genes tend to be associated with functions that are important in early developmental stages, while shorter genes tend to play a role in functions that are important throughout life, such as the immune system, which requires fast responses.