Multivariate outlier detection and remediation in geochemical databases.

Authors

Lalor G C, Zhang C

Affiliation

International Centre for Environmental and Nuclear Sciences, University of the West Indies, Kingston, Jamaica.

Publication details

Sci Total Environ. 2001 Dec 17;281(1-3):99-109. doi: 10.1016/s0048-9697(01)00839-7.

DOI: 10.1016/s0048-9697(01)00839-7
PMID: 11778964
Abstract

In this study, outliers are classified into three types: (1) range outliers; (2) spatial outliers; and (3) relationship outliers, defined as observations that fall outside of the values expected from correlation within the dataset. The multivariate methods of principal component analysis (PCA), multiple regression analysis (MRA) and an autoassociation neural network (AutoNN) method are applied to a dataset comprising 203 samples of rare earth element (REE) concentrations in soils of Jamaica which shows the expected good correlations between the elements. PCA is shown to be effective in detection of high value range outliers, while AutoNN and MRA are effective in detection of relationship outliers. A backpropagation neural network was used to predict the 'expected values' of the outliers. Four obvious relationship outliers with unexpected low Sm concentrations were selected as an example for remediation. The predicted Sm values were confirmed on remeasurement. Neural network methods, with the advantages of being model-free and effective in solving non-linear relationship problems, appear to provide an automated and effective way for the quality control of environmental databases.
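
The distinction between range outliers and relationship outliers can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the element ratios and column order (La, Ce, Pr, Nd, Sm) are hypothetical, ordinary least squares in log space stands in for both the MRA step and the backpropagation network used to predict "expected values", and the AutoNN method is not reproduced. The robust z-score cutoff of -6 is likewise an assumption chosen for the illustration.

```python
import numpy as np

def pc1_scores(X):
    """Scores on the first principal component of standardized log concentrations."""
    Z = np.log(X)
    Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    # Orient the component so that high concentrations give high scores.
    pc1 = vt[0] if vt[0].sum() >= 0 else -vt[0]
    return Z @ pc1

def expected_values(X, target, fit_mask=None):
    """Predict one element from all the others by least squares in log space."""
    logX = np.log(X)
    A = np.column_stack([np.ones(len(X)), np.delete(logX, target, axis=1)])
    y = logX[:, target]
    if fit_mask is None:
        fit_mask = np.ones(len(X), dtype=bool)
    coef, *_ = np.linalg.lstsq(A[fit_mask], y[fit_mask], rcond=None)
    return np.exp(A @ coef)

def robust_z(residuals):
    """Median/MAD standardization, so the outliers do not inflate the scale."""
    med = np.median(residuals)
    mad = np.median(np.abs(residuals - med))
    return (residuals - med) / (1.4826 * mad)

# Synthetic, strongly correlated data (NOT the Jamaican soil dataset).
rng = np.random.default_rng(0)
n = 203
ratios = np.array([1.0, 2.1, 0.25, 0.9, 0.2])  # hypothetical La, Ce, Pr, Nd, Sm
SM = 4                                          # column index of the Sm stand-in
base = rng.lognormal(3.0, 0.5, size=n)
X = base[:, None] * ratios * rng.normal(1.0, 0.03, size=(n, 5))

# One range outlier: every element is high, but the inter-element ratios hold.
X[7] = 3.0 * X[np.argmax(X.sum(axis=1))]

# Four relationship outliers: Sm alone is unexpectedly low.
true_sm = X[:, SM].copy()
low_sm = np.array([10, 50, 90, 130])
X[low_sm, SM] *= 0.2

# Detection: PC1 picks up the range outlier, regression residuals the Sm outliers.
range_outlier = int(np.argmax(pc1_scores(X)))
z = robust_z(np.log(X[:, SM]) - np.log(expected_values(X, SM)))
flagged = z < -6.0

# Remediation: refit without the flagged samples, then predict their Sm values.
repaired = expected_values(X, SM, fit_mask=~flagged)[flagged]

print("range outlier row:", range_outlier)
print("flagged rows:", np.flatnonzero(flagged))
print("predicted Sm for flagged rows:", np.round(repaired, 2))
```

In this sketch the range outlier leaves the regression residuals essentially unchanged because its ratios are intact, while the low-Sm samples barely move the first principal component. This mirrors the abstract's division of labour between PCA and the relationship-based methods.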

Similar articles

1
Multivariate outlier detection and remediation in geochemical databases.
Sci Total Environ. 2001 Dec 17;281(1-3):99-109. doi: 10.1016/s0048-9697(01)00839-7.
2
Outliers detection in multivariate time series by independent component analysis.
Neural Comput. 2007 Jul;19(7):1962-84. doi: 10.1162/neco.2007.19.7.1962.
3
Application of factorial kriging analysis to the FOREGS European topsoil geochemistry database.
Sci Total Environ. 2008 Apr 1;393(1):96-110. doi: 10.1016/j.scitotenv.2007.12.012. Epub 2008 Jan 28.
4
Outlier identification and visualization for Pb concentrations in urban soils and its implications for identification of potential contaminated land.
Environ Pollut. 2009 Nov;157(11):3083-90. doi: 10.1016/j.envpol.2009.05.044. Epub 2009 Jun 13.
5
Geochemical Characteristics and Preliminary Assessment of Geochemical Threshold Values of Technology-Critical Elements in Soils Developed on Different Geological Substrata Along the Sava River Headwaters (Slovenia, Croatia).
Arch Environ Contam Toxicol. 2021 Nov;81(4):541-552. doi: 10.1007/s00244-020-00781-4. Epub 2020 Nov 19.
6
Should classification as an ACS-NSQIP high outlier be used to direct hospital quality improvement efforts?
Am J Surg. 2018 Aug;216(2):213-216. doi: 10.1016/j.amjsurg.2017.07.026. Epub 2017 Jul 21.
7
Potentially toxic elements in urban soils: source apportionment and contamination assessment.
Environ Monit Assess. 2018 Nov 12;190(12):715. doi: 10.1007/s10661-018-7066-8.
8
Geochemical background--concept and reality.
Sci Total Environ. 2005 Nov 1;350(1-3):12-27. doi: 10.1016/j.scitotenv.2005.01.047.
9
Self-organizing feature map (neural networks) as a tool in classification of the relations between chemical composition of aquatic bryophytes and types of streambeds in the Tatra national park in Poland.
Chemosphere. 2007 Mar;67(5):954-60. doi: 10.1016/j.chemosphere.2006.11.001. Epub 2006 Dec 12.
10
Detecting outlier samples in microarray data.
Stat Appl Genet Mol Biol. 2009;8:Article 13. doi: 10.2202/1544-6115.1426. Epub 2009 Feb 11.

Cited by

1
Extreme Longevity: Analysis of the Direct or Indirect Influence of Environmental Factors on Old, Nonagenarians, and Centenarians in Cilento, Italy.
Int J Environ Res Public Health. 2022 Jan 30;19(3):1589. doi: 10.3390/ijerph19031589.
2
DNLC: differential network local consistency analysis.
BMC Bioinformatics. 2019 Dec 24;20(Suppl 15):489. doi: 10.1186/s12859-019-3046-4.
3
Detection of outliers in pollutant emissions from the Soto de Ribera coal-fired power plant using functional data analysis: a case study in northern Spain.
Environ Sci Pollut Res Int. 2020 Jan;27(1):8-20. doi: 10.1007/s11356-019-04435-4. Epub 2019 Feb 15.
4
Does irrigation with reclaimed water significantly pollute shallow aquifer with nitrate and salinity? An assay in a perurban area in North Tunisia.
Environ Monit Assess. 2014 Jul;186(7):4367-90. doi: 10.1007/s10661-014-3705-x. Epub 2014 Mar 28.
5
Intelligent Interfaces for Mining Large-Scale RNAi-HCS Image Databases.
Proc IEEE Int Symp Bioinformatics Bioeng. 2007 Nov 5;2007:1333-1337. doi: 10.1109/BIBE.2007.4375742.