A curated bacterial and archaeal 16S rRNA Gene Oral Sequences dataset.

Author information

Vázquez-González Lara, Regueira-Iglesias Alba, Balsa-Castro Carlos, Tomás Inmaculada, Carreira María J

Affiliations

Centro Singular de Investigación en Tecnoloxías Intelixentes (CiTIUS), Universidade de Santiago de Compostela, Rúa de Jenaro de la Fuente Domínguez, E15782, Santiago de Compostela, Spain.

Instituto de Investigación Sanitaria de Santiago de Compostela (IDIS), E15706, Santiago de Compostela, Spain.

Publication information

Sci Data. 2025 May 2;12(1):729. doi: 10.1038/s41597-025-05050-4.

Abstract

In a given species, genomes and 16S rRNA gene sequences, along with their intragenomic copy numbers, can vary greatly across environments. The gene copy numbers are crucial for technologies which estimate microbial abundances based on gene counts, such as polymerase chain reaction and high-throughput sequencing. In these, taxa with fewer genes may be underestimated, while those with more genes might be overestimated. Therefore, it is essential to have accurate gene copy number databases specific to the niche under study. The 16S rRNA Gene Oral Sequences dataset (16SGOSeq) contains the number of 16S rRNA genes and their variants in the complete genomes of the bacterial and archaeal species present in the human oral cavity. It includes 3,192 complete genomes of oral bacteria and 191 complete genomes of oral archaea, from which the 16S rRNA gene sequences were extracted, and the sequence variants were identified. This oral-specific dataset of prokaryotic organisms and the pipeline followed for its construction can be applied by clinical microbiologists, bioinformaticians, or microbial ecologists in future microbiome research.
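The bias described above (taxa with fewer 16S rRNA gene copies being underestimated and those with more copies overestimated) is commonly addressed by dividing each taxon's read count by its gene copy number before computing relative abundances. The following is a minimal sketch of that idea, not the authors' pipeline: the function name, the taxon labels, and all numeric values are hypothetical and chosen only for illustration.

```python
def correct_for_copy_number(read_counts, copy_numbers):
    """Return relative abundances after dividing reads by 16S gene copy number.

    read_counts:  dict mapping taxon -> number of 16S reads observed
    copy_numbers: dict mapping taxon -> 16S rRNA gene copies per genome
    Both inputs are hypothetical illustrations, not values from 16SGOSeq.
    """
    corrected = {}
    for taxon, reads in read_counts.items():
        copies = copy_numbers[taxon]
        if copies <= 0:
            raise ValueError(f"copy number for {taxon} must be positive")
        # Reads divided by copies per genome approximates the genome (cell) count.
        corrected[taxon] = reads / copies

    total = sum(corrected.values())
    if total == 0:
        raise ValueError("no reads to normalize")
    # Rescale so the corrected values sum to 1.
    return {taxon: value / total for taxon, value in corrected.items()}


# Hypothetical example: taxon_A carries 6 gene copies, taxon_B carries 2.
read_counts = {"taxon_A": 600, "taxon_B": 400}
copy_numbers = {"taxon_A": 6, "taxon_B": 2}
abundances = correct_for_copy_number(read_counts, copy_numbers)
```

In this made-up case the raw read fractions are 0.6 for taxon_A and 0.4 for taxon_B, while the corrected abundances are roughly 0.33 and 0.67, which shows how a high-copy taxon can appear dominant when copy number is ignored.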

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/435b/12048654/02641770a661/41597_2025_5050_Fig1_HTML.jpg
