Curation of a reference database of COI sequences for insect identification through DNA metabarcoding: COins.

Affiliations

Department of Agricultural and Environmental Sciences, University of Milan, Via Celoria 2, Milano 20133, Italy.

Department of Biology and Biotechnology 'Charles Darwin', Sapienza University of Rome, Viale dell'Università 32, Rome 00185, Italy.

Publication information

Database (Oxford). 2022 Jul 6;2022. doi: 10.1093/database/baac055.

Abstract

DNA metabarcoding is a widespread approach for the molecular identification of organisms. While the associated wet-lab and data-processing procedures are well established and highly efficient, the reference databases used for taxonomic assignment can still be improved to increase the accuracy of identifications. Insects are among the organisms for which DNA-based identification is most commonly used; yet a DNA-metabarcoding reference database specifically curated for their species identification, for use with software that requires local databases, is lacking. Here, we present COins, a database of 5' region cytochrome c oxidase subunit I sequences (COI-5P) of insects that includes over 532 000 representative sequences of >106 000 species, specifically formatted for the QIIME2 software platform. Through a combination of automated and manually curated steps, we developed this database starting from all COI sequences available in the Barcode of Life Data System for insects, focusing on sequences that comply with several standards, including a species-level identification. COins was validated on previously published DNA-metabarcoding sequence data (bulk samples from Malaise traps), and its efficiency was compared with that of other publicly available reference databases (not specific to insects). COins can increase species-level identifications by up to 30% and can thus be a valuable resource for the taxonomic assignment of insect DNA-metabarcoding data, especially when species-level identification is needed. Database URL: https://doi.org/10.6084/m9.figshare.19130465.v1
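The general idea of keeping only species-level COI-5P records and reducing them to representative sequences can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the record layout, the binomial-name filter, the exact-duplicate rule, and the helper names (`has_species_level_id`, `select_representatives`, `to_fasta`, `to_taxonomy_tsv`) are all assumptions made for illustration, and the taxonomy strings are simplified to genus and species only. The output mimics the FASTA file plus tab-separated `Feature ID` / `Taxon` table that QIIME2 commonly imports as reference data.

```python
import re
from collections import OrderedDict

# Hypothetical record layout: (record_id, marker, species_name, sequence).
# The filter rules below are illustrative assumptions, not the COins criteria.
BINOMIAL = re.compile(r"^[A-Z][a-z]+ [a-z]+$")


def has_species_level_id(name):
    """Return True for a plain 'Genus species' name (rejects 'sp.', codes, blanks)."""
    return bool(BINOMIAL.match(name.strip()))


def select_representatives(records, marker="COI-5P"):
    """Keep species-level records of one marker, one copy per (species, sequence)."""
    seen = OrderedDict()
    for rec_id, rec_marker, species, seq in records:
        if rec_marker != marker or not has_species_level_id(species):
            continue
        # Normalise case and drop alignment gaps before comparing sequences.
        seq = seq.upper().replace("-", "")
        key = (species.strip(), seq)
        if key not in seen:
            seen[key] = rec_id  # the first record seen represents this sequence
    return [(rec_id, species, seq) for (species, seq), rec_id in seen.items()]


def to_fasta(reps):
    """Render representatives as FASTA text, one record ID per header."""
    return "".join(f">{rec_id}\n{seq}\n" for rec_id, _, seq in reps)


def to_taxonomy_tsv(reps):
    """Render a simplified 'Feature ID<TAB>Taxon' table (genus and species only)."""
    lines = ["Feature ID\tTaxon"]
    for rec_id, species, _ in reps:
        genus = species.split()[0]
        lines.append(f"{rec_id}\tg__{genus}; s__{species.replace(' ', '_')}")
    return "\n".join(lines) + "\n"
```

A real reference set would carry the full taxonomic lineage for each record and would rely on the curation standards described in the paper rather than on a simple name pattern.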
