School of Biological Sciences, University of Adelaide, Adelaide, South Australia, Australia.
Childhood Dementia Research Group, College of Medicine and Public Health, Flinders Health and Medical Research Institute, Flinders University, Bedford Park, South Australia, Australia.
PLoS Comput Biol. 2024 Feb 12;20(2):e1011868. doi: 10.1371/journal.pcbi.1011868. eCollection 2024 Feb.
In comparisons between mutant and wild-type genotypes, transcriptome analysis can reveal the direct impacts of a mutation, together with the homeostatic responses of the biological system. Recent studies have highlighted that, when the effects of homozygosity for recessive mutations are studied in non-isogenic backgrounds, genes located proximal to the mutation on the same chromosome often appear over-represented among those genes identified as differentially expressed (DE). One hypothesis suggests that DE genes chromosomally linked to a mutation may not reflect functional responses to the mutation but, instead, result from an unequal distribution of expression quantitative trait loci (eQTLs) between sample groups of mutant or wild-type genotypes. This is problematic because eQTL expression differences are difficult to distinguish from genes that are DE due to functional responses to a mutation. Here we show that chromosomally co-located differentially expressed genes (CC-DEGs) are also observed in analyses of dominant mutations in heterozygotes. We define a method and a metric to quantify, in RNA-sequencing data, localised differential allelic representation (DAR) between those sample groups subjected to differential expression analysis. We show how the DAR metric can predict regions prone to eQTL-driven differential expression, and how it can improve functional enrichment analyses through gene exclusion or weighting-based approaches. Advantageously, this improved ability to identify probable eQTLs also reveals examples of CC-DEGs that are likely to be functionally related to a mutant phenotype. This supports a long-standing prediction that selection for advantageous linkage disequilibrium influences chromosome evolution. By comparing the genomes of zebrafish (Danio rerio) and medaka (Oryzias latipes), a teleost with a conserved ancestral karyotype, we find possible examples of chromosomal aggregation of CC-DEGs during evolution of the zebrafish lineage. Our method for DAR analysis requires only RNA-sequencing data, facilitating its application across new and existing datasets.
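The abstract does not state how the DAR metric is computed, so the following is only an illustrative sketch under stated assumptions. It assumes that, at each variant site, allele counts are pooled within each sample group, converted to proportions, and compared by Euclidean distance scaled so that the value lies between 0 (identical allelic representation) and 1 (no shared alleles). The function names, the gene-level averaging, and the 0.5 exclusion threshold are hypothetical and chosen for illustration; they are not taken from the source.

```python
import math

ALLELES = ("A", "C", "G", "T")


def allele_proportions(counts):
    """Convert pooled allele counts for one sample group into proportions."""
    total = sum(counts.get(a, 0) for a in ALLELES)
    if total == 0:
        raise ValueError("no allele observations at this site")
    return [counts.get(a, 0) / total for a in ALLELES]


def site_dar(counts_group1, counts_group2):
    """Assumed DAR at one site: scaled Euclidean distance between proportions.

    Two proportion vectors are at most sqrt(2) apart, so dividing by
    sqrt(2) keeps the value within [0, 1].
    """
    p = allele_proportions(counts_group1)
    q = allele_proportions(counts_group2)
    return math.dist(p, q) / math.sqrt(2)


def gene_dar(site_values):
    """Hypothetical gene-level summary: mean DAR over sites assigned to a gene."""
    return sum(site_values) / len(site_values)


def exclude_high_dar(de_genes, dar_by_gene, threshold=0.5):
    """Gene-exclusion sketch: drop DE genes whose DAR exceeds a chosen threshold.

    The threshold is an arbitrary illustrative value, not a recommendation.
    Genes without a DAR estimate are kept.
    """
    return [g for g in de_genes if dar_by_gene.get(g, 0.0) <= threshold]


# Toy counts (made up for illustration): mutant vs wild-type groups at one site.
mutant = {"A": 5, "G": 5}
wild_type = {"A": 10}
example_dar = site_dar(mutant, wild_type)

# Toy gene-level values (made up): geneB sits in a high-DAR region.
kept = exclude_high_dar(
    ["geneA", "geneB", "geneC"],
    {"geneA": 0.1, "geneB": 0.9},
)
```

A weighting-based variant would keep every gene but down-weight those in high-DAR regions (for example by a factor of one minus DAR) before enrichment testing; that choice is likewise an assumption made here for illustration.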