Lyubetsky Vassily A, Shilovsky Gregory A, Yang Jian-Rong, Seliverstov Alexandr V, Zverkov Oleg A
Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevich Institute), 127051 Moscow, Russia.
Department of Genetics and Biomedical Informatics, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou 510080, China.
Biology (Basel). 2024 Oct 2;13(10):792. doi: 10.3390/biology13100792.
This article proposes a methodology for relating two genomic characteristics of a given gene, its rate of change relative to a given taxon and the amino acid composition of the protein it encodes, to the traits of the species that carry the gene. The methodology is illustrated with the mammalian genes that regulate circadian rhythms, which underlie a number of human disorders, particularly those associated with aging. The methods are statistical and bioinformatic. A systematic search for orthologues, pseudogenes, and gene losses was performed using our previously developed methods. The least conserved gene in the Euarchontoglires superorder shows a statistically significant connection between its genomic characteristics (the median of / for the gene relative to all the other orthologous genes of a taxon, as well as the preference for or avoidance of certain amino acids in its protein) and species-specific lifespan and body weight. In contrast, no such connection is observed for this gene in the Laurasiatheria superorder. The study goes beyond protein-coding genes: the accumulation of amino acid substitutions in the course of evolution leads to pseudogenization and even gene loss, yet the relationship between the genomic characteristics and the species traits is still preserved. In the circadian rhythm genes and proteins of placental mammals, for example, longevity is connected with the rate of gene change, with pseudogenization or gene loss, and with specific amino acid substitutions (e.g., asparagine at the 19th position of the CRY-binding domain) in the protein encoded by the gene.
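The abstract does not say which statistical test links a gene's rate metric to a species trait, so the following is only a minimal sketch under stated assumptions. It uses a Spearman rank correlation as one plausible way to check for a monotonic association between a per-species rate metric and lifespan. The function names, the variable names, and all numeric values are hypothetical placeholders, not data or code from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks.

    Assumes x and y have equal length and that neither is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy values, one entry per species (NOT data from the study).
rate_metric = [0.12, 0.18, 0.25, 0.31, 0.40]
lifespan_years = [4.0, 12.0, 9.0, 30.0, 60.0]

rho = spearman_rho(rate_metric, lifespan_years)
print(f"Spearman rho = {rho:.2f}")
```

A rank-based measure is chosen here because lifespan and body weight span orders of magnitude across mammals, so a rank statistic is less sensitive to outliers than a linear correlation. Assessing statistical significance would additionally need a p-value, for example from a permutation test, which this sketch omits.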