Metazoan mitochondrial gene sequence reference datasets for taxonomic assignment of environmental samples.

Affiliations

Biodiversity Research Centre, Academia Sinica, Taipei 11529, Taiwan.

Smithsonian Tropical Research Institute, Panama City, Republic of Panama.

Publication information

Sci Data. 2017 Mar 14;4:170027. doi: 10.1038/sdata.2017.27.

Abstract

Mitochondrial-encoded genes are increasingly targeted in studies using high-throughput sequencing approaches for characterizing metazoan communities from environmental samples (e.g., plankton, meiofauna, filtered water). Yet, unlike for nuclear ribosomal RNA markers, no high-quality reference dataset has so far been available for taxonomic assignments. Here, we retrieved all metazoan mitochondrial gene sequences from GenBank, then quality-filtered the datasets and formatted them for use with taxonomic assignment tools. The reference datasets, named 'Midori references', are available for download at www.reference-midori.info. Two versions are provided: (I) Midori-UNIQUE, which contains all unique haplotypes associated with each species, and (II) Midori-LONGEST, which contains a single sequence, the longest, for each species. Overall, the mitochondrial Cytochrome oxidase subunit I gene was the most sequence-rich gene. However, sequences of the mitochondrial large ribosomal subunit RNA and Cytochrome b apoenzyme genes were observed for a large number of species in some phyla. The Midori reference is compatible with some taxonomic assignment software. Therefore, automated high-throughput sequence taxonomic assignments can be particularly effective using these datasets.
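The difference between the two versions can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `build_unique` and `build_longest`, the toy records, and the simplification that a "unique haplotype" means an exact, case-insensitive sequence match within a species are all assumptions made for illustration.

```python
from collections import defaultdict


def build_unique(records):
    """UNIQUE-style set: keep each distinct sequence once per species.

    Assumption: two sequences are the same haplotype only if they are
    identical strings after uppercasing.
    """
    seen = defaultdict(set)
    unique = []
    for species, seq in records:
        key = seq.upper()
        if key not in seen[species]:
            seen[species].add(key)
            unique.append((species, key))
    return unique


def build_longest(records):
    """LONGEST-style set: keep one sequence per species, the longest.

    On a length tie, the first sequence encountered is kept.
    """
    longest = {}
    for species, seq in records:
        key = seq.upper()
        if species not in longest or len(key) > len(longest[species]):
            longest[species] = key
    return longest


# Hypothetical (species, sequence) pairs standing in for filtered GenBank records.
records = [
    ("Species A", "ACGTACGT"),
    ("Species A", "acgtacgt"),    # same haplotype as above, different case
    ("Species A", "ACGTACGTAA"),  # a second, longer haplotype
    ("Species B", "TTGACA"),
]

unique = build_unique(records)
longest = build_longest(records)
```

With these toy records, the UNIQUE-style set retains two sequences for Species A and one for Species B, while the LONGEST-style set retains exactly one sequence per species.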

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb9d/5349245/2ea01ba7a5c3/sdata201727-f1.jpg
