Cridland Julie M, Polston Elizabeth S, Begun David J
Department of Evolution and Ecology, University of California, Davis, Davis, CA 95616, USA.
Genetics. 2025 May 8;230(1). doi: 10.1093/genetics/iyaf044.
De novo genes can be defined as sequences producing evolutionarily derived transcripts that are not homologous to transcripts produced in an ancestor. While they appear to be taxonomically widespread, there is little agreement regarding their abundance, their persistence times in genomes, the population genetic processes responsible for their spread or loss, or their possible functions. In Drosophila melanogaster, two approaches have been used to discover these genes and investigate their properties. The first uses traditional comparative approaches together with existing genomic resources and annotations. The second uses raw transcriptome data to discover unannotated genes for which there is no evidence of presence in related species. Investigations using the second approach have focused on D. melanogaster genotypes from recently established cosmopolitan populations. However, most of the genetic variation in the species is found in African populations, suggesting that a fuller understanding of genetic novelties in the species may follow from studies of these populations. Here, we investigate de novo gene candidates expressed in testis and accessory glands in a sample of flies from Zambia and compare them with candidate de novo genes expressed in North American populations. We report a large number of previously undiscovered de novo gene candidates, most of which are expressed polymorphically. Many are predicted to code for secreted proteins. Despite very different levels of genomic variation in Zambian and North American populations, the two populations express similar numbers of candidate de novo genes. We find evidence from genetic analysis of Raleigh inbred lines that a fraction of rarely expressed gene candidates in this population represent deleterious transcription promoted by inbreeding depression. Many de novo gene candidates are expressed in multiple tissues and both sexes, raising questions about how they may interact with natural selection. The relative importance of positive and negative selection, however, remains unclear.