PERFect: PERmutation Filtering test for microbiome data.

Affiliations

Department of Mathematical Sciences, University of Montana, 32 Campus Dr., Missoula, MT, USA.

Department of Biostatistics, West Virginia University, 1 Medical Center Dr., Morgantown, WV, USA.

Publication information

Biostatistics. 2019 Oct 1;20(4):615-631. doi: 10.1093/biostatistics/kxy020.

DOI: 10.1093/biostatistics/kxy020

PMID: 29917060

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6797060/

Abstract

The human microbiota composition is associated with a number of diseases including obesity, inflammatory bowel disease, and bacterial vaginosis. Thus, microbiome research has the potential to reshape clinical and therapeutic approaches. However, raw microbiome count data require careful pre-processing steps that take into account both the sparsity of counts and the large number of taxa that are being measured. Filtering is defined as removing taxa that are present in a small number of samples and have small counts in the samples where they are observed. Despite progress in the number and quality of filtering approaches, there is no consensus on filtering standards and quality assessment. This can adversely affect downstream analyses and reproducibility of results across platforms and software. We introduce PERFect, a novel permutation filtering approach designed to address two unsolved problems in microbiome data processing: (i) define and quantify loss due to filtering by implementing thresholds and (ii) introduce and evaluate a permutation test for filtering loss to provide a measure of excessive filtering. Methods are assessed on three "mock experiment" data sets, where the true taxa compositions are known, and are applied to two publicly available real microbiome data sets. The method correctly removes contaminant taxa in "mock" data sets, quantifies and visualizes the corresponding filtering loss, and provides a uniform data-driven filtering criterion for real microbiome data sets. In real data analyses PERFect tends to remove more taxa than existing approaches; this likely happens because the method is based on an explicit loss function, uses statistically principled testing, and takes into account correlation between taxa. The PERFect software is freely available at https://github.com/katiasmirn/PERFect.
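
The two ideas named in the abstract, a loss attached to a set of removed taxa and a permutation test on that loss, can be illustrated with a small sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the loss is taken to be the share of the squared Frobenius norm of the taxa cross-product matrix that disappears when taxa are dropped (chosen because it reflects correlation between taxa), and the null distribution is built by shuffling one taxon's counts across samples. The function names `gram_sq_norm`, `filtering_loss`, and `permutation_p_value` are hypothetical and are not part of the PERFect software.

```python
import random


def gram_sq_norm(counts, taxa):
    """Squared Frobenius norm of X_J^T X_J, where J is a list of taxa (columns)."""
    total = 0.0
    for a in taxa:
        for b in taxa:
            dot = sum(row[a] * row[b] for row in counts)
            total += dot * dot
    return total


def filtering_loss(counts, kept):
    """Assumed loss in [0, 1]: share of the cross-product norm lost by keeping only `kept`.

    Assumes `counts` is a non-empty samples-by-taxa table with at least one nonzero count.
    """
    all_taxa = list(range(len(counts[0])))
    return 1.0 - gram_sq_norm(counts, kept) / gram_sq_norm(counts, all_taxa)


def permutation_p_value(counts, taxon, n_perm=200, seed=0):
    """Compare the loss from dropping `taxon` with losses after shuffling its counts."""
    rng = random.Random(seed)
    kept = [j for j in range(len(counts[0])) if j != taxon]
    observed = filtering_loss(counts, kept)
    column = [row[taxon] for row in counts]
    exceed = 0
    for _ in range(n_perm):
        # Shuffling across samples breaks the taxon's association with the other taxa.
        rng.shuffle(column)
        shuffled = [
            row[:taxon] + [value] + row[taxon + 1:]
            for row, value in zip(counts, column)
        ]
        if filtering_loss(shuffled, kept) >= observed:
            exceed += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (1 + exceed) / (1 + n_perm)


# Toy table: 4 samples (rows) by 3 taxa (columns); the third taxon is nearly absent.
counts = [[10, 8, 0], [20, 18, 0], [30, 28, 1], [40, 38, 0]]
print(filtering_loss(counts, [0, 1]))   # loss from dropping the rare taxon
print(filtering_loss(counts, [1, 2]))   # loss from dropping an abundant taxon
print(permutation_p_value(counts, 2))   # permutation p-value for the rare taxon
```

Under these assumptions, dropping a rare taxon costs very little loss while dropping an abundant, correlated taxon costs a large share, which is the kind of quantity a data-driven filtering threshold could be built on. The actual PERFect procedure may order taxa, define the loss, and construct the null distribution differently.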




