Department of Biology, Hopkins Marine Station, Stanford University, 120 Ocean View Blvd, CA-93950 Pacific Grove, USA.
Department of Marine Sciences, University of Connecticut, 1080 Shennecossett Road, CT-06340 Groton, USA.
Mar Genomics. 2020 Oct;53:100738. doi: 10.1016/j.margen.2019.100738. Epub 2020 Jan 25.
The Atlantic silverside (Menidia menidia) has been the focus of extensive research in ecology, evolutionary biology, and physiology over the past three decades, but a lack of genomic resources has so far hindered examination of the molecular basis underlying the remarkable patterns of phenotypic variation described in this species. Here we present the first reference transcriptome for M. menidia. We sought to capture a single representative sequence from as many genes as possible by first using a combination of Trinity and the CLC Genomics Workbench to assemble contigs de novo from RNA-seq data from multiple individuals, tissue types, and life stages. To reduce redundancy, we passed the combined raw assemblies through a stringent filtering pipeline based on both sequence similarity to related species and computational predictions of transcript quality, condensing an initial set of >480,000 contigs to a final set of 20,998 representative contigs with a total length of 53.3 Mb. In this final assembly, 91% of the contigs were functionally annotated with putative gene function and gene ontology (GO) terms and/or InterProScan identifiers. The assembly contains complete or nearly complete copies of >95% of 248 highly conserved core genes present in low copy number across higher eukaryotes, and partial copies of another 3.8%, suggesting that it provides relatively comprehensive coverage of the M. menidia transcriptome. The assembly provided here will be an important resource for future research on the molecular basis of phenotypic variation in this species.
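The headline figures in the abstract can be turned into a few derived quantities with simple arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical, it assumes 1 Mb = 1,000,000 bp, and it treats ">480,000" and ">95%" as lower bounds, so the retained fraction is an upper bound and the complete core gene count is a minimum. None of the derived values are reported in the abstract itself.

```python
import math

# Figures taken from the abstract.
INITIAL_CONTIGS_MIN = 480_000   # reported as ">480,000", so a lower bound
FINAL_CONTIGS = 20_998
TOTAL_LENGTH_BP = 53.3e6        # assumption: 1 Mb = 1,000,000 bp
CORE_GENES = 248
COMPLETE_FRACTION_MIN = 0.95    # reported as ">95%", so a lower bound
PARTIAL_FRACTION = 0.038


def summarize_assembly():
    """Derive rough summary values from the reported figures."""
    return {
        # Average contig length in the final assembly.
        "mean_contig_length_bp": TOTAL_LENGTH_BP / FINAL_CONTIGS,
        # Upper bound on the share of raw contigs kept after filtering.
        "max_fraction_retained": FINAL_CONTIGS / INITIAL_CONTIGS_MIN,
        # Smallest whole number of genes that exceeds 95% of 248.
        "min_complete_core_genes": math.ceil(COMPLETE_FRACTION_MIN * CORE_GENES),
        # Approximate number of core genes present only as partial copies.
        "approx_partial_core_genes": round(PARTIAL_FRACTION * CORE_GENES),
    }


if __name__ == "__main__":
    for name, value in summarize_assembly().items():
        print(f"{name}: {value}")
```

Under these assumptions the final contigs average roughly 2.5 kb, fewer than about 4.4% of the raw contigs survive filtering, and at least 236 of the 248 core genes are complete or nearly complete.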