Goldberg Jay K, Allan Carson W, Copetti Dario, Matzkin Luciano M, Bronstein Judith
Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona, USA.
Department of Cellular and Developmental Biology, John Innes Centre, Norwich, Norfolk, UK.
Ecol Evol. 2024 Mar 11;14(3):e10979. doi: 10.1002/ece3.10979. eCollection 2024 Mar.
Assembling genomes from pooled samples of genetically heterogeneous conspecifics remains challenging. In this study, we show that high-quality genome assemblies can be produced from pooled samples of multiple wild-caught individuals. We sequenced DNA extracted from a pooled sample of conspecific herbivorous insects (Hemiptera: Miridae) acquired from a greenhouse infestation in Tucson, Arizona (approximately 30-100 individuals; 0.5 mL tissue by volume) using PacBio highly accurate long reads (HiFi). The initial assembly contained multiple haplotigs (>85% BUSCOs duplicated), but duplicate contigs could be easily purged to yield a highly complete assembly (95.6% BUSCO, 4.4% duplicated) that is highly contiguous by short-read assembly standards (N50 = 675 kb; largest contig = 4.3 Mb). We then used our assembly as the basis for a genome-guided differential expression study of host plant-specific transcriptional responses. We found thousands of genes (n = 4982) to be differentially expressed between our new data from individuals feeding on one host plant (Solanaceae) and existing RNA-seq data from individuals fed on a different host plant (Solanaceae). Many of these genes belong to previously documented detoxification gene families, such as glutathione-S-transferases, cytochrome P450s, and UDP-glucosyltransferases. Together, our results show that long-read sequencing of pooled samples can provide a cost-effective genome assembly option for small insects and can offer insights into the genetic mechanisms underlying interactions between plants and herbivorous pests.
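The contiguity figure reported above (N50) follows a standard definition: sort contigs from longest to shortest and report the length of the contig at which the running total first reaches half of the total assembly length. The sketch below illustrates that definition only; the function name `n50` and the example contig lengths are hypothetical and are not taken from the study's data or pipeline.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    N50 is the length L such that contigs of length >= L together
    account for at least half of the total assembly length.
    """
    # Walk through contigs from longest to shortest.
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2

    running = 0
    for length in lengths:
        running += length
        # The first contig that pushes the running sum to half the total is the N50 contig.
        if running >= half_total:
            return length

    # Only reached when the input is empty.
    raise ValueError("contig_lengths must not be empty")


# Hypothetical toy assembly (lengths in kb), not data from the paper.
toy_lengths = [80, 70, 50, 40, 30, 20]
toy_n50 = n50(toy_lengths)
```

With the toy lengths the total is 290, so the halfway point is 145; the two longest contigs sum to 150, which makes the second contig (70) the N50. Purging haplotigs typically changes both the total length and the contig set, so N50 is best compared between assemblies of similar total size.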