Microbiome-based classification models for fresh produce safety and quality evaluation.

Affiliations

Department of Food Science and Technology, University of California Davis, Davis, California, USA.

Department of Molecular and Cellular Biology, University of California Davis, Davis, California, USA.

Publication information

Microbiol Spectr. 2024 Apr 2;12(4):e0344823. doi: 10.1128/spectrum.03448-23. Epub 2024 Mar 6.

DOI:10.1128/spectrum.03448-23
PMID:38445872
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10986475/
Abstract

Small sample sizes and loss of sequencing reads during the microbiome data preprocessing can limit the statistical power of differentiating fresh produce phenotypes and prevent the detection of important bacterial species associated with produce contamination or quality reduction. Here, we explored a machine learning-based k-mer hash analysis strategy to identify DNA signatures predictive of produce safety (PS) and produce quality (PQ) and compared it against the amplicon sequence variant (ASV) strategy that uses a typical denoising step and ASV-based taxonomy strategy. Random forest-based classifiers for PS and PQ using 7-mer hash data sets had significantly higher classification accuracy than those using the ASV data sets. We also demonstrated that the proposed combination of integrating multiple data sets and leveraging a 7-mer hash strategy leads to better classification performance for PS and PQ compared to the ASV method but presents lower PS classification accuracy compared to the feature-selected ASV-based taxonomy strategy. Due to the current limitation of generating taxonomy using the 7-mer hash strategy, the ASV-based taxonomy strategy with remarkably less computing time and memory usage is more efficient for PS and PQ classification and applicable for important taxa identification. Results generated from this study lay the foundation for future studies that wish and need to incorporate and/or compare different microbiome sequencing data sets for the application of machine learning in the area of microbial safety and quality of food.
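To make the k-mer hash idea concrete, the following is a minimal sketch of turning a sample's reads into a hashed 7-mer count profile, the kind of feature vector a classifier could be trained on. The abstract does not say how the hashes are computed, so the MD5-based 32-bit hash, the function names `kmer_hash` and `kmer_hash_counts`, and the rule of skipping k-mers with ambiguous bases are all assumptions made for illustration, not the authors' pipeline.

```python
import hashlib
from collections import Counter

K = 7  # k-mer length reported in the abstract


def kmer_hash(kmer: str) -> int:
    """Return a stable 32-bit integer hash of a k-mer (assumed scheme: first 4 bytes of MD5)."""
    digest = hashlib.md5(kmer.encode("ascii")).digest()
    return int.from_bytes(digest[:4], "big")


def kmer_hash_counts(reads, k=K):
    """Count hashed k-mers over all reads of one sample.

    K-mers containing characters other than A, C, G, T are skipped
    (an assumption; the source does not describe this step).
    """
    valid = set("ACGT")
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= valid:
                counts[kmer_hash(kmer)] += 1
    return counts


# Hypothetical toy reads, not data from the study.
sample_reads = ["ACGTACGTAC", "ACGTACGNNN"]
profile = kmer_hash_counts(sample_reads)
```

Because the profile is built directly from reads, no denoising step is involved, which is one plausible reading of why this route loses fewer reads than an ASV workflow.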

IMPORTANCE

Identification of generalizable indicators for produce safety (PS) and produce quality (PQ) improves the detection of produce contamination and quality decline. However, effective sequencing read loss during microbiome data preprocessing and the limited sample size of individual studies restrain statistical power to identify important features contributing to differentiating PS and PQ phenotypes. We applied machine learning-based models using individual and integrated k-mer hash and amplicon sequence variant (ASV) data sets for PS and PQ classification and evaluated their classification performance and found that random forest (RF)-based models using integrated 7-mer hash data sets achieved significantly higher PS and PQ classification accuracy. Due to the limitation of taxonomic analysis for the 7-mer hash, we also developed RF-based models using feature-selected ASV-based taxonomic data sets, which performed better PS classification than those using the integrated 7-mer hash data set. The RF feature selection method identified 480 PS indicators and 263 PQ indicators with a positive contribution to the PS and PQ classification.
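One way to picture the "integrated" data sets mentioned above: for a fixed k, every study draws its features from the same k-mer vocabulary, so per-study count tables can be stacked into a single matrix. The sketch below assumes per-sample count dictionaries, relative-abundance normalization, and zero-filling of features absent from a sample; the function names and the study and sample labels are hypothetical, and the source does not describe how the authors actually merged their data.

```python
from collections import Counter


def relative_abundance(counts):
    """Scale raw counts so that each sample's features sum to 1."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {feature: n / total for feature, n in counts.items()}


def integrate(studies):
    """Stack samples from several studies into one feature matrix.

    studies maps study name -> {sample id -> Counter of k-mer counts}.
    Returns (row ids, sorted feature names, rows of relative abundances).
    """
    features = sorted(
        {f for samples in studies.values() for c in samples.values() for f in c}
    )
    ids, rows = [], []
    for study, samples in studies.items():
        for sample_id, counts in samples.items():
            rel = relative_abundance(counts)
            ids.append(f"{study}:{sample_id}")
            # Features not observed in this sample are filled with 0.0.
            rows.append([rel.get(f, 0.0) for f in features])
    return ids, features, rows


# Hypothetical toy input, not data from the study.
studies = {
    "study_a": {"s1": Counter({"ACGTACG": 3, "CGTACGT": 1})},
    "study_b": {"s1": Counter({"ACGTACG": 1, "TTTTTTT": 1})},
}
ids, features, rows = integrate(studies)
```

The resulting rows could then be passed, together with PS or PQ labels, to any classifier; the abstract reports using random forests for that step.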

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ba2/10986475/c753b626c8bc/spectrum.03448-23.f001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ba2/10986475/79c08e9de5ea/spectrum.03448-23.f002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ba2/10986475/a19345c29a54/spectrum.03448-23.f003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ba2/10986475/0d82ba6d146b/spectrum.03448-23.f004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ba2/10986475/23bf7ceaac09/spectrum.03448-23.f005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7ba2/10986475/4afda373a8fe/spectrum.03448-23.f006.jpg
