CAMDA 2023: Finding patterns in urban microbiomes.

Authors

Contreras-Peruyero Haydeé, Nuñez Imanol, Vazquez-Rosas-Landa Mirna, Santana-Quinteros Daniel, Pashkov Antón, Carranza-Barragán Mario E, Perez-Estrada Rafael, Guerrero-Flores Shaday, Balanzario Eugenio, Muñiz Sánchez Víctor, Nakamura Miguel, Ramírez-Ramírez L Leticia, Sélem-Mojica Nelly

Affiliations

Centro de Ciencias Matemáticas, Universidad Nacional Autónoma de México, Morelia, Mexico.

Centro de Investigación en Matemáticas, A.C., Guanajuato, Mexico.

Publication information

Front Genet. 2024 Nov 25;15:1449461. doi: 10.3389/fgene.2024.1449461. eCollection 2024.

DOI:10.3389/fgene.2024.1449461
PMID:39655221
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11625776/
Abstract

The Critical Assessment of Massive Data Analysis (CAMDA) addresses the complexities of harnessing Big Data in the life sciences by hosting annual competitions that inspire research groups to develop innovative solutions. In 2023, the Forensic Challenge focused on identifying the city of origin for 365 metagenomic samples collected from public transportation systems and on identifying associations between bacterial distribution and other covariates. For microbiome classification, we incorporated both taxonomic and functional annotations as features. To identify the most informative Operational Taxonomic Units (OTUs), we selected features by fitting negative binomial models. We then implemented supervised models, conducting 5-fold cross-validation (CV) with a 4:1 training-to-validation ratio. After variable selection, which reduced the dataset to fewer than 300 OTUs, the Support Vector Classifier achieved the highest F1 score (0.96). When using functional features from MIFASER, the Neural Network model outperformed the other models. When considering climatic and demographic variables of the cities, Dirichlet regression over the abundances of three bacterial taxa [taxon names missing] suggests that population increase is associated with a rise in the mean of one of them [taxon name missing], while decreasing temperature is linked to higher proportions of another [taxon name missing]. This study validates microbiome classification using taxonomic features and, to a lesser extent, functional features. It shows that demographic and climatic factors influence urban microbial distribution. A Docker container and a Conda environment are available in the project's GitHub repository, facilitating broader adoption and validation of these methods by the scientific community.
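The evaluation scheme mentioned in the abstract (5-fold CV, where each fold trains on four parts and validates on one, scored by F1) can be sketched roughly as follows. This is only an illustration under stated assumptions: the data are synthetic, the nearest-centroid classifier is a simple stand-in for the Support Vector Classifier used in the study, macro-averaging of F1 is assumed because the abstract does not say which average was reported, and all function names are hypothetical.

```python
import random
from collections import defaultdict

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k near-equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def nearest_centroid_fit(X, y):
    """Stand-in classifier: store the mean feature vector of each class."""
    sums, counts = {}, defaultdict(int)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] += 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}

def nearest_centroid_predict(centroids, X):
    """Assign each row to the class with the closest centroid."""
    def dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(centroids, key=lambda c: dist(centroids[c], row)) for row in X]

def cross_validate(X, y, k=5, seed=0):
    """Train on k-1 folds, validate on the held-out fold, for each fold."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, val_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = nearest_centroid_fit([X[j] for j in train_idx],
                                     [y[j] for j in train_idx])
        pred = nearest_centroid_predict(model, [X[j] for j in val_idx])
        scores.append(macro_f1([y[j] for j in val_idx], pred))
    return scores

# Synthetic stand-in data: 365 samples, 5 hypothetical cities, 10 features.
rng = random.Random(1)
N_SAMPLES, N_FEATURES, N_CITIES = 365, 10, 5
y = [i % N_CITIES for i in range(N_SAMPLES)]
X = [[rng.gauss(10.0 * (label == j % N_CITIES), 1.0) for j in range(N_FEATURES)]
     for label in y]

fold_sizes = [len(f) for f in k_fold_indices(N_SAMPLES)]
scores = cross_validate(X, y)
print("validation fold sizes:", fold_sizes)
print("per-fold macro F1:", [round(s, 2) for s in scores])
```

With 365 samples and five folds, each validation fold holds 73 samples and each training set 292, which is the 4:1 ratio the abstract describes. In practice a library such as scikit-learn would replace the hand-written splitting and scoring helpers.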


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e9f/11625776/9e84f5154ac1/fgene-15-1449461-g001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e9f/11625776/28be8f6a05d2/fgene-15-1449461-g002.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e9f/11625776/7a395825b743/fgene-15-1449461-g003.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e9f/11625776/360ffafadc02/fgene-15-1449461-g004.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3e9f/11625776/67f7a1026b03/fgene-15-1449461-g005.jpg

Similar articles

1
CAMDA 2023: Finding patterns in urban microbiomes.
Front Genet. 2024 Nov 25;15:1449461. doi: 10.3389/fgene.2024.1449461. eCollection 2024.
2
Identification of city specific important bacterial signature for the MetaSUB CAMDA challenge microbiome data.
Biol Direct. 2019 Jul 24;14(1):11. doi: 10.1186/s13062-019-0243-z.
3
Application of machine learning techniques for creating urban microbial fingerprints.
Biol Direct. 2019 Aug 16;14(1):13. doi: 10.1186/s13062-019-0245-x.
4
Systematic evaluation of supervised machine learning for sample origin prediction using metagenomic sequencing data.
Biol Direct. 2020 Dec 10;15(1):29. doi: 10.1186/s13062-020-00287-y.
5
Unraveling city-specific signature and identifying sample origin locations for the data from CAMDA MetaSUB challenge.
Biol Direct. 2021 Jan 4;16(1):1. doi: 10.1186/s13062-020-00284-1.
6
Unraveling City-Specific Microbial Signatures and Identifying Sample Origins for the Data From CAMDA 2020 Metagenomic Geolocation Challenge.
Front Genet. 2021 Aug 5;12:659650. doi: 10.3389/fgene.2021.659650. eCollection 2021.
7
Fingerprinting cities: differentiating subway microbiome functionality.
Biol Direct. 2019 Oct 30;14(1):19. doi: 10.1186/s13062-019-0252-y.
8
A machine learning framework to determine geolocations from metagenomic profiling.
Biol Direct. 2020 Nov 23;15(1):27. doi: 10.1186/s13062-020-00278-z.
9
Massive metagenomic data analysis using abundance-based machine learning.
Biol Direct. 2019 Aug 1;14(1):12. doi: 10.1186/s13062-019-0242-0.
10
Unraveling bacterial fingerprints of city subways from microbiome 16S gene profiles.
Biol Direct. 2018 May 22;13(1):10. doi: 10.1186/s13062-018-0215-8.
