A Snakemake Toolkit for the Batch Assembly, Annotation and Phylogenetic Analysis of Mitochondrial Genomes and Ribosomal Genes From Genome Skims of Museum Collections.

Authors

White Oliver W, Hall Andie, Price Ben W, Williams Suzanne T, Clark Matthew D

Affiliation

The Natural History Museum, London, UK.

Publication information

Mol Ecol Resour. 2025 Jan;25(1):e14036. doi: 10.1111/1755-0998.14036. Epub 2024 Oct 28.

DOI:10.1111/1755-0998.14036

PMID:39465511

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11646300/

Abstract

Low coverage 'genome-skims' are often used to assemble organelle genomes and ribosomal gene sequences for cost-effective phylogenetic and barcoding studies. Natural history collections hold invaluable biological information, yet poor preservation resulting in degraded DNA often hinders polymerase chain reaction-based analyses. However, it is possible to generate libraries and sequence the short fragments typical of degraded DNA to generate genome-skims from museum collections. Here we introduce a snakemake toolkit comprised of three pipelines skim2mito, skim2rrna and gene2phylo, designed to unlock the genomic potential of historical museum specimens using genome skimming. Specifically, skim2mito and skim2rrna perform the batch assembly, annotation and phylogenetic analysis of mitochondrial genomes and nuclear ribosomal genes, respectively, from low-coverage genome skims. The third pipeline gene2phylo takes a set of gene alignments and performs phylogenetic analysis of individual genes, partitioned analysis of concatenated alignments and a phylogenetic analysis based on gene trees. We benchmark our pipelines with simulated data, followed by testing with a novel genome skimming dataset from both recent and historical solariellid gastropod samples. We show that the toolkit can recover mitochondrial and ribosomal genes from poorly preserved museum specimens of the gastropod family Solariellidae, and the phylogenetic analysis is consistent with our current understanding of taxonomic relationships. The generation of bioinformatic pipelines that facilitate processing large quantities of sequence data from the vast repository of specimens held in natural history museum collections will greatly aid species discovery and exploration of biodiversity over time, ultimately aiding conservation efforts in the face of a changing planet.
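As an illustration of the input that a partitioned analysis of concatenated alignments needs, the following minimal sketch joins per-gene alignments into one supermatrix and records where each gene sits. It is not the gene2phylo implementation: the function name `concatenate`, the toy gene and sample names, and the choice to pad missing genes with gap characters are all assumptions made for this example. The printed lines follow the RAxML-style partition format (`DNA, name = start-end`).

```python
from typing import Dict, List, Tuple

Alignment = Dict[str, str]  # sample name -> aligned sequence


def concatenate(genes: Dict[str, Alignment]) -> Tuple[Alignment, List[Tuple[str, int, int]]]:
    """Concatenate per-gene alignments into one supermatrix (hypothetical helper).

    Returns the concatenated alignment and a list of (gene, start, end)
    partitions in 1-based inclusive coordinates.
    """
    # Union of all sample names across genes, in a stable order.
    samples = sorted({name for aln in genes.values() for name in aln})
    matrix: Dict[str, List[str]] = {name: [] for name in samples}
    partitions: List[Tuple[str, int, int]] = []
    position = 0
    for gene, aln in genes.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        length = lengths.pop()
        for name in samples:
            # Assumption: a sample lacking this gene is padded with gaps.
            matrix[name].append(aln.get(name, "-" * length))
        partitions.append((gene, position + 1, position + length))
        position += length
    return {name: "".join(parts) for name, parts in matrix.items()}, partitions


# Toy data: two genes, three samples, each gene missing one sample.
genes = {
    "cox1": {"sampleA": "ATGC", "sampleB": "ATGA"},
    "rrnL": {"sampleA": "GG-TT", "sampleC": "GGATT"},
}
supermatrix, partitions = concatenate(genes)
for gene, start, end in partitions:
    print(f"DNA, {gene} = {start}-{end}")
```

Keeping the partition coordinates alongside the supermatrix lets each gene receive its own substitution model in a downstream phylogenetic program, which is the point of a partitioned analysis.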

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a4fc/11646300/b458e61a5c88/MEN-25-e14036-g004.jpg

Similar articles

Skimming for barcodes: rapid production of mitochondrial genome and nuclear ribosomal repeat reference markers through shallow shotgun sequencing.

PeerJ. 2022 Aug 5;10:e13790. doi: 10.7717/peerj.13790. eCollection 2022.

Sequence capture phylogenomics of historical ethanol-preserved museum specimens: Unlocking the rest of the vault.

Mol Ecol Resour. 2019 Nov;19(6):1531-1544. doi: 10.1111/1755-0998.13072. Epub 2019 Sep 18.

Museomics allows comparative analyses of mitochondrial genomes in the family Gryllidae (Insecta, Orthoptera) and confirms its phylogenetic relationships.

PeerJ. 2024 Aug 8;12:e17734. doi: 10.7717/peerj.17734. eCollection 2024.

A pipeline for assembling low copy nuclear markers from plant genome skimming data for phylogenetic use.

PeerJ. 2022 Dec 6;10:e14525. doi: 10.7717/peerj.14525. eCollection 2022.

Testing Efficacy of Assembly-Free and Alignment-Free Methods for Species Identification Using Genome Skims, with Patellogastropoda as a Test Case.

Genes (Basel). 2022 Jul 2;13(7):1192. doi: 10.3390/genes13071192.

Genomic treasure troves: complete genome sequencing of herbarium and insect museum specimens.

PLoS One. 2013 Jul 29;8(7):e69189. doi: 10.1371/journal.pone.0069189. Print 2013.

Fixing Formalin: A Method to Recover Genomic-Scale DNA Sequence Data from Formalin-Fixed Museum Specimens Using High-Throughput Sequencing.

PLoS One. 2015 Oct 27;10(10):e0141579. doi: 10.1371/journal.pone.0141579. eCollection 2015.

Complete mitochondrial genomes of eleven extinct or possibly extinct bird species.

Mol Ecol Resour. 2017 Mar;17(2):334-341. doi: 10.1111/1755-0998.12600. Epub 2016 Oct 11.

Ten new complete mitochondrial genomes of pulmonates (Mollusca: Gastropoda) and their impact on phylogenetic relationships.

BMC Evol Biol. 2011 Oct 10;11:295. doi: 10.1186/1471-2148-11-295.

Cited by

A composite universal DNA signature for the tree of life.

Nat Ecol Evol. 2025 Jun 25. doi: 10.1038/s41559-025-02752-1.

The mitochondrial genomes of Iberian freshwater and diadromous fishes.

Sci Data. 2025 Feb 27;12(1):349. doi: 10.1038/s41597-024-04297-7.
