Whole Genome Sequences of 23 Species from the Drosophila montium Species Group (Diptera: Drosophilidae): A Resource for Testing Evolutionary Hypotheses.

Author information

Bronski Michael J, Martinez Ciera C, Weld Holli A, Eisen Michael B

Affiliations

Department of Molecular and Cell Biology, University of California, Berkeley

Publication information

G3 (Bethesda). 2020 May 4;10(5):1443-1455. doi: 10.1534/g3.119.400959.

DOI:10.1534/g3.119.400959

PMID:32220952

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7202002/

Abstract

Large groups of species with well-defined phylogenies are excellent systems for testing evolutionary hypotheses. In this paper, we describe the creation of a comparative genomic resource consisting of 23 genomes from the species-rich Drosophila montium species group, 22 of which are presented here for the first time. The montium group is well-positioned for clade genomics. Within the montium clade, evolutionary distances are such that large numbers of sequences can be accurately aligned while also recovering strong signals of divergence; and the distance between the montium group and D. melanogaster is short enough that orthologous sequence can be readily identified. All genomes were assembled from a single, small-insert library using MaSuRCA, before going through an extensive post-assembly pipeline. Estimated genome sizes within the montium group range from 155 Mb to 223 Mb (mean = 196 Mb). The absence of long-distance information during the assembly process resulted in fragmented assemblies, with the scaffold NG50s varying widely based on repeat content and sample heterozygosity (min = 18 kb, max = 390 kb, mean = 74 kb). The total scaffold length for most assemblies is also shorter than the estimated genome size, typically by 5-15%. However, subsequent analysis showed that our assemblies are highly complete. Despite large differences in contiguity, all assemblies contain at least 96% of known single-copy Dipteran genes (BUSCOs, n = 2,799). Similarly, by aligning our assemblies to the D. melanogaster genome and remapping coordinates for a large set of transcriptional enhancers (n = 3,457), we showed that each montium assembly contains orthologs for at least 91% of D. melanogaster enhancers. Importantly, the genic and enhancer contents of our assemblies are comparable to those of far more contiguous Drosophila assemblies. The alignment of our own D. serrata assembly to a previously published PacBio D. serrata assembly also showed that our longest scaffolds (up to 1 Mb) are free of large-scale misassemblies. Our genome assemblies are a valuable resource that can be used to further resolve the montium group phylogeny; study the evolution of protein-coding genes and cis-regulatory sequences; and determine the genetic basis of ecological and behavioral adaptations.
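The abstract reports contiguity as scaffold NG50, which differs from N50 in that the 50% threshold is taken against the estimated genome size rather than the total assembly length. The following is a minimal sketch of that calculation under the standard definition; the function name `ng50` and the toy scaffold lengths are illustrative assumptions and are not taken from the paper or its pipeline.

```python
from typing import Iterable, Optional


def ng50(scaffold_lengths: Iterable[int], genome_size: int) -> Optional[int]:
    """Return the scaffold NG50 for an assembly.

    NG50 is the length L of the scaffold at which the running total of
    scaffold lengths, taken from longest to shortest, first reaches half
    of the estimated genome size. Returns None when the whole assembly
    sums to less than half of the genome size.
    """
    target = genome_size / 2
    running_total = 0
    for length in sorted(scaffold_lengths, reverse=True):
        running_total += length
        if running_total >= target:
            return length
    return None


# Toy scaffold lengths (hypothetical values, arbitrary units).
lengths = [400, 300, 200, 100, 50, 50]

# With the assembly length as the denominator this reduces to N50.
n50_value = ng50(lengths, genome_size=sum(lengths))

# With a larger estimated genome size, the threshold is harder to reach,
# so NG50 is less than or equal to N50.
ng50_value = ng50(lengths, genome_size=1500)

print(n50_value, ng50_value)
```

Because most assemblies described here are 5-15% shorter than their estimated genome size, NG50 computed this way would be expected to be no larger than the corresponding N50.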

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d1c0/7202002/9d2464517002/1443f1.jpg
