Information Scale Correction for Varying Length Amplicons Improves Eukaryotic Microbiome Data Integration.

Authors

Zhou Tong, Zhao Feng, Xu Kuidong

Affiliations

Laboratory of Marine Organism Taxonomy and Phylogeny, Qingdao Key Laboratory of Marine Biodiversity and Conservation, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China.

Shandong Province Key Laboratory of Experimental Marine Biology, Institute of Oceanology, Chinese Academy of Sciences, Qingdao 266071, China.

Publication information

Microorganisms. 2023 Apr 6;11(4):949. doi: 10.3390/microorganisms11040949.

DOI:10.3390/microorganisms11040949
PMID:37110372
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10146031/
Abstract

The integration and reanalysis of big data provide valuable insights into microbiome studies. However, the significant difference in information scale between amplicon data poses a key challenge in data analysis. Therefore, reducing batch effects is crucial to enhance data integration for large-scale molecular ecology data. To achieve this, the information scale correction (ISC) step, involving cutting different length amplicons into the same sub-region, is essential. In this study, we used the Hidden Markov model (HMM) method to extract 11 different 18S rRNA gene v4 region amplicon datasets with 578 samples in total. The length of the amplicons ranged from 344 bp to 720 bp, depending on the primer position. By comparing the information scale correction of amplicons with varying lengths, we explored the extent to which the comparability between samples decreases with increasing amplicon length. Our method was shown to be more sensitive than V-Xtractor, the most popular tool for performing ISC. We found that near-scale amplicons exhibited no significant change after ISC, while larger-scale amplicons exhibited significant changes. After the ISC treatment, the similarity among the data sets improved, especially for long amplicons. Therefore, we recommend adding ISC processing when integrating big data, which is crucial for unlocking the full potential of microbial community studies and advancing our knowledge of microbial ecology.
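
The core idea of ISC, trimming amplicons of different lengths to one shared sub-region, can be illustrated with a minimal sketch. This is not the authors' HMM-based method: it swaps in plain motif matching against two conserved flanking anchors, and the anchor motifs, function names, and toy sequences are all hypothetical, chosen only for illustration.

```python
import re

# IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile a (possibly degenerate) nucleotide motif into a regex."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


def trim_to_subregion(seq, left_anchor, right_anchor):
    """Return the sequence between the two anchors, or None if either is missing.

    Assumption: exact motif matching stands in for the profile-based (HMM)
    search described in the abstract.
    """
    seq = seq.upper()
    left = motif_to_regex(left_anchor).search(seq)
    if left is None:
        return None
    # Look for the right anchor only downstream of the left anchor.
    right = motif_to_regex(right_anchor).search(seq, left.end())
    if right is None:
        return None
    return seq[left.end():right.start()]


def correct_information_scale(amplicons, left_anchor, right_anchor):
    """Trim every amplicon to the shared sub-region; report those that fail."""
    kept, dropped = {}, []
    for name, seq in amplicons.items():
        sub = trim_to_subregion(seq, left_anchor, right_anchor)
        if sub:
            kept[name] = sub
        else:
            dropped.append(name)
    return kept, dropped


if __name__ == "__main__":
    # Toy data: the same region amplified with two hypothetical primer sets.
    amplicons = {
        "short": "CCAGCAACGTACGTTTCGA",
        "long": "GGGTTTCCAGCAACGTACGTTTTGAAAACCC",
        "no_anchor": "ACGTACGTAC",
    }
    kept, dropped = correct_information_scale(amplicons, "CCAGCA", "TTYGA")
    print(kept)
    print(dropped)
```

After trimming, the short and long toy amplicons cover the same sub-region and can be compared directly, while sequences lacking either anchor are reported separately rather than silently discarded.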

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb1a/10146031/d9ff50e5d6af/microorganisms-11-00949-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb1a/10146031/f1aff0cb405a/microorganisms-11-00949-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb1a/10146031/2e7acc21a329/microorganisms-11-00949-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb1a/10146031/ec6d866912ac/microorganisms-11-00949-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb1a/10146031/f3fc4cef3de8/microorganisms-11-00949-g005.jpg

Similar articles

1
Information Scale Correction for Varying Length Amplicons Improves Eukaryotic Microbiome Data Integration.
Microorganisms. 2023 Apr 6;11(4):949. doi: 10.3390/microorganisms11040949.
2
Primer, Pipelines, Parameters: Issues in 16S rRNA Gene Sequencing.
mSphere. 2021 Feb 24;6(1):e01202-20. doi: 10.1128/mSphere.01202-20.
3
Kelpie: generating full-length 'amplicons' from whole-metagenome datasets.
PeerJ. 2019 Jan 30;6:e6174. doi: 10.7717/peerj.6174. eCollection 2019.
4
Different Amplicon Targets for Sequencing-Based Studies of Fungal Diversity.
Appl Environ Microbiol. 2017 Aug 17;83(17). doi: 10.1128/AEM.00905-17. Print 2017 Sep 1.
5
Evaluating and Improving Small Subunit rRNA PCR Primer Coverage for Bacteria, Archaea, and Eukaryotes Using Metagenomes from Global Ocean Surveys.
mSystems. 2021 Jun 29;6(3):e0056521. doi: 10.1128/mSystems.00565-21. Epub 2021 Jun 1.
6
Impact of Bead-Beating Intensity on the Genus- and Species-Level Characterization of the Gut Microbiome Using Amplicon and Complete 16S rRNA Gene Sequencing.
Front Cell Infect Microbiol. 2021 Oct 1;11:678522. doi: 10.3389/fcimb.2021.678522. eCollection 2021.
7
Multi-amplicon microbiome data analysis pipelines for mixed orientation sequences using QIIME2: Assessing reference database, variable region and pre-processing bias in classification of mock bacterial community samples.
PLoS One. 2023 Jan 13;18(1):e0280293. doi: 10.1371/journal.pone.0280293. eCollection 2023.
8
[Community Diversity of Eukaryotic Nano-phytoplankton in Yellow Sea Using DNA Metabarcoding Technology Based on Multiple Amplicons].
Huan Jing Ke Xue. 2019 Sep 8;40(9):4052-4060. doi: 10.13227/j.hjkx.201811025.
9
A Viability Quantitative PCR Dilemma: Are Longer Amplicons Better?
Appl Environ Microbiol. 2021 Feb 12;87(5):e0265320. doi: 10.1128/AEM.02653-20. Epub 2020 Dec 23.
10
Mining environmental high-throughput sequence data sets to identify divergent amplicon clusters for phylogenetic reconstruction and morphotype visualization.
Environ Microbiol Rep. 2015 Aug;7(4):679-86. doi: 10.1111/1758-2229.12307.

Cited by

1
Primer selection impacts the evaluation of microecological patterns in environmental microbiomes.
Imeta. 2023 Sep 17;2(4):e135. doi: 10.1002/imt2.135. eCollection 2023 Nov.
