Interpretable Log Contrasts for the Classification of Health Biomarkers: a New Approach to Balance Selection.

Authors

Quinn Thomas P, Erb Ionas

Affiliations

Independent Scientist, Geelong, Australia

Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain.

Publication

mSystems. 2020 Apr 7;5(2):e00230-19. doi: 10.1128/mSystems.00230-19.

DOI:10.1128/mSystems.00230-19
PMID:32265314
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7141889/
Abstract

Since the turn of the century, technological advances have made it possible to obtain the molecular profile of any tissue in a cost-effective manner. Among these advances are sophisticated high-throughput assays that measure the relative abundances of microorganisms, RNA molecules, and metabolites. While these data are most often collected to gain new insights into biological systems, they can also be used as biomarkers to create clinically useful diagnostic classifiers. How best to classify high-dimensional -omics data remains an area of active research. However, few explicitly model the relative nature of these data and instead rely on cumbersome normalizations. This report (i) emphasizes the relative nature of health biomarkers, (ii) discusses the literature surrounding the classification of relative data, and (iii) benchmarks how different transformations perform for regularized logistic regression across multiple biomarker types. We show how an interpretable set of log contrasts, called balances, can prepare data for classification. We propose a simple procedure, called discriminative balance analysis, to select groups of 2 and 3 bacteria that can together discriminate between experimental conditions. Discriminative balance analysis is a fast, accurate, and interpretable alternative to data normalization.

High-throughput sequencing provides an easy and cost-effective way to measure the relative abundance of bacteria in any environmental or biological sample. When these samples come from humans, the microbiome signatures can act as biomarkers for disease prediction. However, because bacterial abundance is measured as a composition, the data have unique properties that make conventional analyses inappropriate. To overcome this, analysts often use cumbersome normalizations. This article proposes an alternative method that identifies pairs and trios of bacteria whose stoichiometric presence can differentiate between diseased and nondiseased samples. By using interpretable log contrasts called balances, we developed an entirely normalization-free classification procedure that reduces the feature space and improves the interpretability, without sacrificing classifier performance.
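The abstract describes balances as interpretable log contrasts over pairs and trios of bacteria. The sketch below shows how such a quantity can be computed. It assumes the standard definition from compositional data analysis: the scaled log-ratio of the geometric means of two disjoint groups of parts. The function name `balance`, the example counts, and the chosen indices are illustrative assumptions and are not taken from the article. The sketch computes only a balance, not the authors' discriminative balance analysis procedure, and it assumes strictly positive abundances (zeros would need to be handled first).

```python
import numpy as np

def balance(x, numerator, denominator):
    """Balance (a scaled log contrast) between two disjoint groups of parts.

    x: 1-D array of strictly positive abundances for one sample.
    numerator, denominator: lists of indices into x for the two groups.
    """
    x = np.asarray(x, dtype=float)
    r, s = len(numerator), len(denominator)
    # The log of a geometric mean equals the mean of the logs.
    log_gm_num = np.log(x[numerator]).mean()
    log_gm_den = np.log(x[denominator]).mean()
    return np.sqrt(r * s / (r + s)) * (log_gm_num - log_gm_den)

# Hypothetical read counts for four taxa in one sample (illustrative only).
counts = np.array([120.0, 30.0, 50.0, 800.0])
proportions = counts / counts.sum()

pair = balance(counts, [0], [1])     # 2-part balance: taxon 0 vs taxon 1
trio = balance(counts, [0, 1], [2])  # 3-part balance: taxa 0 and 1 vs taxon 2

# The same balance computed on proportions instead of raw counts.
pair_from_props = balance(proportions, [0], [1])
```

Because a balance depends only on ratios between parts, `pair` and `pair_from_props` agree up to floating-point error. This scale invariance is what allows a balance-based classifier to skip a separate normalization step.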

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a83/7141889/30176be0bc38/mSystems.00230-19-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a83/7141889/d5f15e56a22b/mSystems.00230-19-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a83/7141889/d16fa25ee8fb/mSystems.00230-19-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a83/7141889/38de60217e3e/mSystems.00230-19-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a83/7141889/d428ebeced63/mSystems.00230-19-f0005.jpg

Similar articles

1
Interpretable Log Contrasts for the Classification of Health Biomarkers: a New Approach to Balance Selection.
mSystems. 2020 Apr 7;5(2):e00230-19. doi: 10.1128/mSystems.00230-19.
2
White box radial basis function classifiers with component selection for clinical prediction models.
Artif Intell Med. 2014 Jan;60(1):53-64. doi: 10.1016/j.artmed.2013.10.001. Epub 2013 Oct 18.
3
GutBalance: a server for the human gut microbiome-based disease prediction and biomarker discovery with compositionality addressed.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbaa436.
4
Balances: a New Perspective for Microbiome Analysis.
mSystems. 2018 Jul 17;3(4). doi: 10.1128/mSystems.00053-18. eCollection 2018 Jul-Aug.
5
Translational Metabolomics of Head Injury: Exploring Dysfunctional Cerebral Metabolism with Ex Vivo NMR Spectroscopy-Based Metabolite Quantification.
6
Early prediction of radiotherapy-induced parotid shrinkage and toxicity based on CT radiomics and fuzzy classification.
Artif Intell Med. 2017 Sep;81:41-53. doi: 10.1016/j.artmed.2017.03.004. Epub 2017 Mar 18.
7
Proportion-based normalizations outperform compositional data transformations in machine learning applications.
Microbiome. 2024 Mar 5;12(1):45. doi: 10.1186/s40168-023-01747-z.
8
Engineering Aspects of Olfaction.
9
Identification of city specific important bacterial signature for the MetaSUB CAMDA challenge microbiome data.
Biol Direct. 2019 Jul 24;14(1):11. doi: 10.1186/s13062-019-0243-z.
10
Principal microbial groups: compositional alternative to phylogenetic grouping of microbiome data.
Brief Bioinform. 2022 Sep 20;23(5). doi: 10.1093/bib/bbac328.

Cited by

1
Exploring the role of normalization and feature selection in microbiome disease classification pipelines.
Gigascience. 2025 Jan 6;14. doi: 10.1093/gigascience/giaf096.
2
Citywide metagenomic surveillance of food centres reveals local microbial signatures and antibiotic resistance gene enrichment.
NPJ Antimicrob Resist. 2025 Jul 8;3(1):63. doi: 10.1038/s44259-025-00132-0.
3
A benchmark analysis of feature selection and machine learning methods for environmental metabarcoding datasets.
Comput Struct Biotechnol J. 2025 Apr 16;27:1636-1647. doi: 10.1016/j.csbj.2025.04.017. eCollection 2025.
4
Effects of data transformation and model selection on feature importance in microbiome classification data.
Microbiome. 2025 Jan 4;13(1):2. doi: 10.1186/s40168-024-01996-6.
5
Three approaches to supervised learning for compositional data with pairwise logratios.
J Appl Stat. 2022 Aug 6;50(16):3272-3293. doi: 10.1080/02664763.2022.2108007. eCollection 2023.
6
Overview of data preprocessing for machine learning applications in human microbiome research.
Front Microbiol. 2023 Oct 5;14:1250909. doi: 10.3389/fmicb.2023.1250909. eCollection 2023.
7
Approximation of a Microbiome Composition Shift by a Change in a Single Balance Between Two Groups of Taxa.
mSystems. 2022 Jun 28;7(3):e0015522. doi: 10.1128/msystems.00155-22. Epub 2022 May 9.
8
tascCODA: Bayesian Tree-Aggregated Analysis of Compositional Amplicon and Single-Cell Data.
Front Genet. 2021 Dec 7;12:766405. doi: 10.3389/fgene.2021.766405. eCollection 2021.
9
Gut Microbiota as Early Predictor of Infectious Complications before Cardiac Surgery: A Prospective Pilot Study.
J Pers Med. 2021 Oct 29;11(11):1113. doi: 10.3390/jpm11111113.
10
Impact of Plant-Based Meat Alternatives on the Gut Microbiota of Consumers: A Real-World Study.
Foods. 2021 Aug 30;10(9):2040. doi: 10.3390/foods10092040.

References cited in this article

1
A field guide for the compositional analysis of any-omics data.
Gigascience. 2019 Sep 1;8(9). doi: 10.1093/gigascience/giz107.
2
A Novel Sparse Compositional Technique Reveals Microbial Perturbations.
mSystems. 2019 Feb 12;4(1). doi: 10.1128/mSystems.00016-19. eCollection 2019 Jan-Feb.
3
Gut microbiome structure and metabolic activity in inflammatory bowel disease.
Nat Microbiol. 2019 Feb;4(2):293-305. doi: 10.1038/s41564-018-0306-4. Epub 2018 Dec 10.
4
Visualizing balances of compositional data: A new alternative to balance dendrograms.
F1000Res. 2018 Aug 14;7:1278. doi: 10.12688/f1000research.15858.1. eCollection 2018.
5
How does normalization impact RNA-seq disease diagnosis?
J Biomed Inform. 2018 Sep;85:80-92. doi: 10.1016/j.jbi.2018.07.016. Epub 2018 Jul 21.
6
Balances: a New Perspective for Microbiome Analysis.
mSystems. 2018 Jul 17;3(4). doi: 10.1128/mSystems.00053-18. eCollection 2018 Jul-Aug.
7
The Gut Microbiome Profile in Obesity: A Systematic Review.
Int J Endocrinol. 2018 Mar 22;2018:4095789. doi: 10.1155/2018/4095789. eCollection 2018.
8
Understanding sequencing data as compositions: an outlook and review.
Bioinformatics. 2018 Aug 15;34(16):2870-2878. doi: 10.1093/bioinformatics/bty175.
9
: an R-package for the rapid implementation of machine learning algorithms.
F1000Res. 2016 Oct 27;5:2588. doi: 10.12688/f1000research.9893.2. eCollection 2016.
10
Meta-analysis of gut microbiome studies identifies disease-specific and shared responses.
Nat Commun. 2017 Dec 5;8(1):1784. doi: 10.1038/s41467-017-01973-8.