
Variance Component Selection With Applications to Microbiome Taxonomic Data.

Authors

Zhai Jing, Kim Juhyun, Knox Kenneth S, Twigg Homer L, Zhou Hua, Zhou Jin J

Affiliations

Department of Epidemiology and Biostatistics, University of Arizona, Tucson, AZ, United States.

Department of Biostatistics, University of California, Los Angeles, Los Angeles, CA, United States.

Publication information

Front Microbiol. 2018 Mar 28;9:509. doi: 10.3389/fmicb.2018.00509. eCollection 2018.

DOI:10.3389/fmicb.2018.00509
PMID:29643839
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5883493/
Abstract

High-throughput sequencing technology has enabled population-based studies of the role of the human microbiome in disease etiology and exposure response. Microbiome data are summarized as counts or compositions of the bacterial taxa at different taxonomic levels. An important problem is to identify the bacterial taxa that are associated with a response. One method is to test the association of a specific taxon with phenotypes in a linear mixed effect model, which incorporates phylogenetic information among bacterial communities. Another type of approach considers all taxa in a joint model and achieves selection via penalization, but ignores phylogenetic information. In this paper, we consider regression analysis by treating bacterial taxa at different levels as multiple random effects. For each taxon, a kernel matrix is calculated based on distance measures in the phylogenetic tree and acts as one variance component in the joint model. Taxonomic selection is then achieved by the lasso (least absolute shrinkage and selection operator) penalty on the variance components. Our method integrates biological information into the variable selection problem and greatly improves selection accuracy. Simulation studies show that our method outperforms existing methods such as the group lasso. Finally, we apply our method to a longitudinal microbiome study of Human Immunodeficiency Virus (HIV) infected patients. We implement our method in the high-performance computing language Julia. Software and detailed documentation are freely available at https://github.com/JingZhai63/VCselection.
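The modeling idea in the abstract (one kernel matrix per taxon, each scaling a variance component, with a lasso penalty on those components) can be illustrated with a minimal Python sketch. This is not the authors' Julia implementation. The function names `distance_to_kernel` and `penalized_nll` are hypothetical, the Gower-centering step is one common way to turn a distance matrix into a kernel and is an assumption here, and the penalty is written directly on the variance components as the abstract words it; the paper's exact parameterization may differ.

```python
import numpy as np

def distance_to_kernel(D):
    """Turn an n x n distance matrix into a positive semidefinite kernel.

    Assumption: Gower centering, K = -1/2 * J * D^2 * J with J = I - 11'/n,
    followed by clipping of negative eigenvalues.
    """
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    K = -0.5 * J @ (D ** 2) @ J
    # Symmetrize, then zero out negative eigenvalues so K is PSD.
    w, V = np.linalg.eigh((K + K.T) / 2)
    return (V * np.clip(w, 0.0, None)) @ V.T

def penalized_nll(y, X, beta, sigma2, kernels, lam):
    """Gaussian negative log-likelihood plus a lasso penalty.

    Model sketch: y ~ N(X beta, sigma2[0] * I + sum_i sigma2[i] * K_i).
    sigma2[0] is the residual variance and is not penalized;
    sigma2[1:] pair with `kernels` and are penalized (assumed form).
    """
    n = len(y)
    Omega = sigma2[0] * np.eye(n)
    for s2, K in zip(sigma2[1:], kernels):
        Omega += s2 * K
    r = y - X @ beta
    _, logdet = np.linalg.slogdet(Omega)
    nll = 0.5 * (logdet + r @ np.linalg.solve(Omega, r) + n * np.log(2 * np.pi))
    # Variance components are non-negative, so the L1 norm is a plain sum.
    return nll + lam * np.sum(sigma2[1:])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(6, 2))  # toy stand-in for per-sample taxon profiles
    D = np.linalg.norm(pts[:, None] - pts[None], axis=-1)
    K = distance_to_kernel(D)
    y = rng.normal(size=6)
    X = np.ones((6, 1))
    print(penalized_nll(y, X, np.zeros(1), np.array([1.0, 0.3]), [K], lam=0.5))
```

Minimizing this objective over `beta` and `sigma2` (subject to non-negativity) would shrink some variance components exactly to zero, which is how a taxon is dropped from the model. The optimizer itself is left out of the sketch.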


