Advantages of Synthetic Noise and Machine Learning for Analyzing Radioecological Data Sets.

Author information

Shuryak Igor

Affiliation

Center for Radiological Research, Columbia University, New York, New York, United States of America.

Publication information

PLoS One. 2017 Jan 9;12(1):e0170007. doi: 10.1371/journal.pone.0170007. eCollection 2017.

DOI:10.1371/journal.pone.0170007
PMID:28068401
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5222373/
Abstract

The ecological effects of accidental or malicious radioactive contamination are insufficiently understood because of the hazards and difficulties associated with conducting studies in radioactively-polluted areas. Data sets from severely contaminated locations can therefore be small. Moreover, many potentially important factors, such as soil concentrations of toxic chemicals, pH, and temperature, can be correlated with radiation levels and with each other. In such situations, commonly-used statistical techniques like generalized linear models (GLMs) may not be able to provide useful information about how radiation and/or these other variables affect the outcome (e.g. abundance of the studied organisms). Ensemble machine learning methods such as random forests offer powerful alternatives. We propose that analysis of small radioecological data sets by GLMs and/or machine learning can be made more informative by using the following techniques: (1) adding synthetic noise variables to provide benchmarks for distinguishing the performances of valuable predictors from irrelevant ones; (2) adding noise directly to the predictors and/or to the outcome to test the robustness of analysis results against random data fluctuations; (3) adding artificial effects to selected predictors to test the sensitivity of the analysis methods in detecting predictor effects; (4) running a selected machine learning method multiple times (with different random-number seeds) to test the robustness of the detected "signal"; (5) using several machine learning methods to test the "signal's" sensitivity to differences in analysis techniques. Here, we applied these approaches to simulated data, and to two published examples of small radioecological data sets: (I) counts of fungal taxa in samples of soil contaminated by the Chernobyl nuclear power plant accident (Ukraine), and (II) bacterial abundance in soil samples under a ruptured nuclear waste storage tank (USA). We show that the proposed techniques were advantageous compared with the methodology used in the original publications where the data sets were presented. Specifically, our approach identified a negative effect of radioactive contamination in data set I, and suggested that in data set II stable chromium could have been a stronger limiting factor for bacterial abundance than the radionuclides ¹³⁷Cs and ⁹⁹Tc. This new information, which was extracted from these data sets using the proposed techniques, can potentially enhance the design of radioactive waste bioremediation.
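
Techniques (1) and (4) from the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the absolute Pearson correlation stands in for a random forest importance score, the data are simulated, and the names `pearson`, `noise_benchmark`, `radiation`, `ph`, and `abundance` are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def noise_benchmark(predictors, outcome, n_noise=200, seed=0):
    """Score each predictor by |r| with the outcome and return the scores
    together with a benchmark: the largest |r| reached by any of n_noise
    synthetic Gaussian noise variables (technique 1)."""
    rng = random.Random(seed)
    n = len(outcome)
    noise_scores = [
        abs(pearson([rng.gauss(0.0, 1.0) for _ in range(n)], outcome))
        for _ in range(n_noise)
    ]
    scores = {name: abs(pearson(vals, outcome)) for name, vals in predictors.items()}
    return scores, max(noise_scores)


# Hypothetical simulated data set: 30 samples, abundance falls with radiation,
# pH has no built-in effect on the outcome.
data_rng = random.Random(42)
n_samples = 30
radiation = [data_rng.uniform(0.0, 10.0) for _ in range(n_samples)]
ph = [data_rng.uniform(5.0, 8.0) for _ in range(n_samples)]
abundance = [100.0 - 8.0 * r + data_rng.gauss(0.0, 5.0) for r in radiation]

# Technique 4: repeat with different random-number seeds and check whether
# each predictor beats the noise benchmark every time.
flags = []
for seed in range(5):
    scores, threshold = noise_benchmark(
        {"radiation": radiation, "pH": ph}, abundance, seed=seed
    )
    flags.append({name: score > threshold for name, score in scores.items()})

for seed, flag in enumerate(flags):
    print(seed, flag)
```

A predictor that exceeds the noise benchmark under every seed is a more credible "signal" than one that does so only occasionally. Swapping the correlation score for a random forest variable importance would bring the sketch closer to the setting the abstract describes.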

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ddbc/5222373/837bcdb7a83a/pone.0170007.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ddbc/5222373/89105971c942/pone.0170007.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ddbc/5222373/336a26762673/pone.0170007.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ddbc/5222373/d60748d2e281/pone.0170007.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ddbc/5222373/4c05a4deae1a/pone.0170007.g005.jpg


