

Single-fly assemblies fill major phylogenomic gaps across the Drosophilidae Tree of Life.

Author information

Kim Bernard Y, Gellert Hannah R, Church Samuel H, Suvorov Anton, Anderson Sean S, Barmina Olga, Beskid Sofia G, Comeault Aaron A, Crown K Nicole, Diamond Sarah E, Dorus Steve, Fujichika Takako, Hemker James A, Hrcek Jan, Kankare Maaria, Katoh Toru, Magnacca Karl N, Martin Ryan A, Matsunaga Teruyuki, Medeiros Matthew J, Miller Danny E, Pitnick Scott, Simoni Sara, Steenwinkel Tessa E, Schiffer Michele, Syed Zeeshan A, Takahashi Aya, Wei Kevin H-C, Yokoyama Tsuya, Eisen Michael B, Kopp Artyom, Matute Daniel, Obbard Darren J, O'Grady Patrick M, Price Donald K, Toda Masanori J, Werner Thomas, Petrov Dmitri A

Affiliations

Department of Biology, Stanford University, USA.

Department of Ecology and Evolutionary Biology, Yale University, USA.

Publication information

bioRxiv. 2023 Oct 2:2023.10.02.560517. doi: 10.1101/2023.10.02.560517.

Abstract

Long-read sequencing is driving rapid progress in genome assembly across all major groups of life, including species of the family Drosophilidae, a longtime model system for genetics, genomics, and evolution. We previously developed a cost-effective hybrid Oxford Nanopore (ONT) long-read and Illumina short-read sequencing approach and used it to assemble 101 drosophilid genomes from laboratory cultures, greatly increasing the number of genome assemblies for this taxonomic group. The next major challenge is to address the laboratory culture bias in taxon sampling by sequencing genomes of species that cannot easily be reared in the lab. Here, we build upon our previous methods to perform amplification-free ONT sequencing of single wild flies obtained either directly from the field or from ethanol-preserved specimens in museum collections, greatly improving the representation of lesser-studied drosophilid taxa in whole-genome data. Using Illumina NovaSeq X Plus and ONT P2 sequencers with R10.4.1 chemistry, we set a new benchmark for inexpensive hybrid genome assembly at US $150 per genome while assembling genomes from as little as 35 ng of genomic DNA from a single fly. We present 183 new genome assemblies for 179 species as a resource for drosophilid systematics, phylogenetics, and comparative genomics. Of these genomes, 62 are from pooled lab strains and 121 from single adult flies. Despite the sample limitations of working with small insects, most single-fly diploid assemblies are comparable in contiguity (>1 Mb contig N50), completeness (>98% complete dipteran BUSCOs), and accuracy (>QV40 genome-wide with ONT R10.4.1) to assemblies from inbred lines. We present a well-resolved multi-locus phylogeny for 360 drosophilid and 4 outgroup species encompassing all publicly available (as of August 2023) genomes for this group. Finally, we present a Progressive Cactus whole-genome, reference-free alignment built from a subset of 298 suitably high-quality drosophilid genomes. The new assemblies and alignment, along with updated laboratory protocols and computational pipelines, are released as an open resource and as a tool for studying evolution at the scale of an entire insect family.
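The contiguity and accuracy thresholds quoted in the abstract (contig N50 and QV) can be made concrete with a minimal sketch. It assumes the standard definitions only: N50 is the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half the assembly size, and QV is the Phred-scaled per-base error rate. The function names and the contig lengths are hypothetical illustrations, not the authors' pipeline, which may well rely on dedicated assessment tools.

```python
import math


def contig_n50(lengths):
    """Return the contig N50 for a list of contig lengths (in bases).

    Contigs are sorted from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half the assembly.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs given")


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(rate):
    """Convert a per-base error probability to a Phred-scaled quality value."""
    return -10 * math.log10(rate)


# Hypothetical contig lengths, for illustration only (not data from the paper).
example_contigs = [2_000_000, 3_000_000, 4_000_000, 5_000_000, 6_000_000]
example_n50 = contig_n50(example_contigs)

# QV40, the accuracy threshold quoted in the abstract, as an error rate.
qv40_error = qv_to_error_rate(40)
```

Under these definitions, QV40 corresponds to roughly one erroneous base per 10,000 bases, and a contig N50 above 1 Mb means at least half of the assembled sequence lies in contigs longer than 1 Mb.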


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a41b/10592941/19a199920f87/nihpp-2023.10.02.560517v1-f0001.jpg
