Supervised self organizing maps for classification and determination of potentially discriminatory variables: illustrated by application to nuclear magnetic resonance metabolomic profiling.

Affiliation

Centre of Chemometrics, School of Chemistry, University of Bristol, Cantocks Close, Bristol, BS8 1TS, UK.

Publication information

Anal Chem. 2010 Jan 15;82(2):628-38. doi: 10.1021/ac9020566.

Abstract

The article describes the extension of the self organizing maps discrimination index (SOMDI) for cases where there are more than two classes and more than one factor that may influence the group of samples by using supervised SOMs to determine which variables and how many are responsible for the different types of separation. The methods are illustrated by an application in the area of metabolic profiling, consisting of a nuclear magnetic resonance (NMR) data set of 96 samples of human saliva, which is characterized by three factors, namely, whether the sample has been treated or not, 16 donors, and 3 sampling days, differing for each donor. The sampling days can be considered a null factor as they should have no significant influence on the metabolic profile. Methods for supervised SOMs involve including a classifier for organizing the map, and we report a method for optimizing this by using an additional weight that determines the relative importance of the classifier relative to the overall experimental data set in order to avoid overfitting. Supervised SOMs can be obtained for each of the three factors, and we develop a multiclass SOM discrimination index (SOMDI) to determine which variables (or regions of the NMR spectra) are considered significant for each of the three potential factors. By dividing the data iteratively into training and test sets 100 times, we define variables as significant for a given factor if they have a positive SOMDI in the training set for the factor and class of interest over all iterations.
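The selection rule in the last sentence of the abstract (keep a variable only if its index is positive in the training set on every one of the 100 splits) can be illustrated with a minimal sketch. The abstract does not give the SOMDI formula, the split ratio, or the splitting scheme, so everything below is an assumption: `stand_in_index` is a simple difference of class means used purely as a placeholder for the SOMDI, `train_frac=2/3` is a guess, the splits are unstratified random permutations, and the data are synthetic.

```python
import numpy as np

def stand_in_index(X_train, y_train, target_class):
    """Placeholder score, NOT the SOMDI: per-variable mean of the target
    class minus the mean of all other training samples."""
    in_class = y_train == target_class
    return X_train[in_class].mean(axis=0) - X_train[~in_class].mean(axis=0)

def consistently_positive(X, y, target_class, index_fn=stand_in_index,
                          n_iter=100, train_frac=2 / 3, seed=0):
    """Return a boolean mask of variables whose index is positive in the
    training set on every one of n_iter random train/test splits."""
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]
    n_train = int(round(train_frac * n_samples))  # assumed split ratio
    always_positive = np.ones(X.shape[1], dtype=bool)
    for _ in range(n_iter):
        # Unstratified random split; only the training part is scored.
        train_idx = rng.permutation(n_samples)[:n_train]
        scores = index_fn(X[train_idx], y[train_idx], target_class)
        always_positive &= scores > 0
    return always_positive

# Synthetic illustration: 96 samples, 5 variables, two classes.
# Variable 0 is shifted upward in class 1, the others are pure noise.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 48)
X = rng.normal(size=(96, 5))
X[y == 1, 0] += 5.0
selected = consistently_positive(X, y, target_class=1)
print(selected)
```

Requiring positivity on all splits is a strict criterion: a variable that is positive on 99 of 100 splits is still rejected, which favours stability over sensitivity.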
