Application of Taxonomic Modeling to Microbiota Data Mining for Detection of Helminth Infection in Global Populations.

Authors

Torbati Mahbaneh Eshaghzadeh, Mitreva Makedonka, Gopalakrishnan Vanathi

Affiliations

Department of Computer Science, University of Pittsburgh, 6135 Sennott Square, 210 S Bouquet St, Pittsburgh, PA 15260-9161, USA.

Department of Medicine, Washington University School of Medicine, 660 S Euclid Ave, St. Louis, MO 63110, USA.

Publication information

Data (Basel). 2016 Dec;1(3). doi: 10.3390/data1030019. Epub 2016 Dec 13.

DOI:10.3390/data1030019
PMID:28239609
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5325162/
Abstract

Human microbiome data from genomic sequencing technologies is fast accumulating, giving us insights into bacterial taxa that contribute to health and disease. The predictive modeling of such microbiota count data for the classification of human infection from parasitic worms, such as helminths, can help in the detection and management across global populations. Real-world datasets of microbiome experiments are typically sparse, containing hundreds of measurements for bacterial species, of which only a few are detected in the bio-specimens that are analyzed. This feature of microbiome data produces the challenge of needing more observations for accurate predictive modeling and has been dealt with previously, using different methods of feature reduction. To our knowledge, integrative methods, such as transfer learning, have not yet been explored in the microbiome domain as a way to deal with data sparsity by incorporating knowledge of different but related datasets. One way of incorporating this knowledge is by using a meaningful mapping among features of these datasets. In this paper, we claim that this mapping would exist among members of each individual cluster, grouped based on phylogenetic dependency among taxa and their association to the phenotype. We validate our claim by showing that models incorporating associations in such a grouped feature space result in no performance deterioration for the given classification task. In this paper, we test our hypothesis by using classification models that detect helminth infection in microbiota of human fecal samples obtained from Indonesia and Liberia countries. In our experiments, we first learn binary classifiers for helminth infection detection by using Naive Bayes, Support Vector Machines, Multilayer Perceptrons, and Random Forest methods. 
In the next step, we add taxonomic modeling by using the SMART-scan module to group the data, and learn classifiers using the same four methods, to test the validity of the achieved groupings. We observed a 6% to 23% and 7% to 26% performance improvement based on the Area Under the receiver operating characteristic (ROC) Curve (AUC) and Balanced Accuracy (Bacc) measures, respectively, over 10 runs of 10-fold cross-validation. These results show that using phylogenetic dependency for grouping our microbiota data actually results in a noticeable improvement in classification performance for helminth infection detection. These promising results from this feasibility study demonstrate that methods such as SMART-scan can be utilized in the future for knowledge transfer from different but related microbiome datasets by phylogenetically-related functional mapping, to enable novel integrative biomarker discovery.
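As a rough illustration of the ideas in the abstract, the sketch below sums sparse species-level counts into coarser taxonomic groups and computes the two reported measures, Balanced Accuracy and AUC, from first principles. It is not the SMART-scan module: the grouping here is a plain species-to-genus lookup, and the function names (`group_by_genus`, `balanced_accuracy`, `auc`), the taxa mapping, and all numbers are invented for illustration only.

```python
from collections import defaultdict


def group_by_genus(sample_counts, species_to_genus):
    """Sum species-level counts into genus-level features for one sample."""
    grouped = defaultdict(int)
    for species, count in sample_counts.items():
        # Species with no known genus keep their own name as the group key.
        grouped[species_to_genus.get(species, species)] += count
    return dict(grouped)


def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = infected)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)


def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical taxonomy lookup and one sparse sample (toy values).
species_to_genus = {
    "Bacteroides fragilis": "Bacteroides",
    "Bacteroides vulgatus": "Bacteroides",
    "Prevotella copri": "Prevotella",
}
sample = {
    "Bacteroides fragilis": 3,
    "Bacteroides vulgatus": 0,
    "Prevotella copri": 7,
    "Unknown sp.": 1,
}
grouped = group_by_genus(sample, species_to_genus)

# Hypothetical classifier scores for four samples (1 = infected).
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

bacc = balanced_accuracy(y_true, y_pred)
roc_auc = auc(y_true, scores)
```

In a setup like the one the abstract describes, these measures would be averaged over repeated cross-validation folds, once for the original features and once for the grouped features, so that the two feature spaces can be compared on the same splits.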

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a9d3/5325162/76aed06d5ad8/nihms846804f2a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a9d3/5325162/37c6c80085d2/nihms846804f1.jpg

Similar articles

1. Application of Taxonomic Modeling to Microbiota Data Mining for Detection of Helminth Infection in Global Populations. Data (Basel). 2016 Dec;1(3). doi: 10.3390/data1030019. Epub 2016 Dec 13.
2. A Machine Learning Approach Reveals a Microbiota Signature for Infection with Mycobacterium avium subsp. in Cattle. Microbiol Spectr. 2023 Feb 14;11(1):e0313422. doi: 10.1128/spectrum.03134-22. Epub 2023 Jan 19.
3. Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas. Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
4. Machine Learning Strategy for Gut Microbiome-Based Diagnostic Screening of Cardiovascular Disease. Hypertension. 2020 Nov;76(5):1555-1562. doi: 10.1161/HYPERTENSIONAHA.120.15885. Epub 2020 Sep 10.
5. Effects of Rare Microbiome Taxa Filtering on Statistical Analysis. Front Microbiol. 2021 Jan 12;11:607325. doi: 10.3389/fmicb.2020.607325. eCollection 2020.
6. MICHELINdb: a web-based tool for mining of helminth-microbiota interaction datasets, and a meta-analysis of current research. Microbiome. 2020 Feb 3;8(1):10. doi: 10.1186/s40168-019-0782-7.
7. microBiomeGSM: the identification of taxonomic biomarkers from metagenomic data using grouping, scoring and modeling (G-S-M) approach. Front Microbiol. 2023 Nov 22;14:1264941. doi: 10.3389/fmicb.2023.1264941. eCollection 2023.
8. Microbiome-based classification models for fresh produce safety and quality evaluation. Microbiol Spectr. 2024 Apr 2;12(4):e0344823. doi: 10.1128/spectrum.03448-23. Epub 2024 Mar 6.
9. Can Predictive Modeling Tools Identify Patients at High Risk of Prolonged Opioid Use After ACL Reconstruction? Clin Orthop Relat Res. 2020 Jul;478(7):0-1618. doi: 10.1097/CORR.0000000000001251.
10. A novel deep learning method for predictive modeling of microbiome data. Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa073.

Cited by

1. Dynamic changes in human-gut microbiome in relation to a placebo-controlled anthelminthic trial in Indonesia. PLoS Negl Trop Dis. 2018 Aug 9;12(8):e0006620. doi: 10.1371/journal.pntd.0006620. eCollection 2018 Aug.
2. A Multi-Omics Database for Parasitic Nematodes and Trematodes. Methods Mol Biol. 2018;1757:371-397. doi: 10.1007/978-1-4939-7737-6_13.
3. Numerical analyses of intestinal microbiota by data mining. J Clin Biochem Nutr. 2018 Mar;62(2):124-131. doi: 10.3164/jcbn.17-84. Epub 2018 Jan 11.
4. Differential human gut microbiome assemblages during soil-transmitted helminth infections in Indonesia and Liberia. Microbiome. 2018 Feb 28;6(1):33. doi: 10.1186/s40168-018-0416-5.

References

1. MixMC: A Multivariate Statistical Framework to Gain Insight into Microbial Communities. PLoS One. 2016 Aug 11;11(8):e0160169. doi: 10.1371/journal.pone.0160169. eCollection 2016.
2. Antibiotic perturbation of the murine gut microbiome enhances the adiposity, insulin resistance, and liver disease associated with high-fat diet. Genome Med. 2016 Apr 27;8(1):48. doi: 10.1186/s13073-016-0297-9.
3. Study of duodenal bacterial communities by 16S rRNA gene analysis in adults with active celiac disease vs non-celiac disease controls. J Appl Microbiol. 2016 Jun;120(6):1691-700. doi: 10.1111/jam.13111. Epub 2016 Apr 4.
4. The effect of dietary resistant starch type 2 on the microbiota and markers of gut inflammation in rural Malawi children. Microbiome. 2015 Sep 3;3:37. doi: 10.1186/s40168-015-0102-9.
5. Knowledge transfer via classification rules using functional mapping for integrative modeling of gene expression data. BMC Bioinformatics. 2015 Jul 23;16:226. doi: 10.1186/s12859-015-0643-8.
6. Application of high-dimensional feature selection: evaluation for genomic prediction in man. Sci Rep. 2015 May 19;5:10312. doi: 10.1038/srep10312.
7. Fine-scale analysis of 16S rRNA sequences reveals a high level of taxonomic diversity among vaginal Atopobium spp. Pathog Dis. 2015 Jun;73(4). doi: 10.1093/femspd/ftv020. Epub 2015 Mar 15.
8. Selection of models for the analysis of risk-factor trees: leveraging biological knowledge to mine large sets of risk factors with application to microbiome data. Bioinformatics. 2015 May 15;31(10):1607-13. doi: 10.1093/bioinformatics/btu855. Epub 2015 Jan 6.
9. Ribosomal Database Project: data and tools for high throughput rRNA analysis. Nucleic Acids Res. 2014 Jan;42(Database issue):D633-42. doi: 10.1093/nar/gkt1244. Epub 2013 Nov 27.
10. Hypothesis testing and power calculations for taxonomic-based human microbiome data. PLoS One. 2012;7(12):e52078. doi: 10.1371/journal.pone.0052078. Epub 2012 Dec 20.