School of Physics and Optoelectronic Engineering, Yangtze University, Jingzhou 434023, China.
Institute of Applied Chemistry, College of Material and Chemical Engineering, Tongren University, Tongren 554300, China.
Talanta. 2018 Aug 15;186:489-496. doi: 10.1016/j.talanta.2018.04.081. Epub 2018 Apr 28.
Metabonomics has been widely used in disease diagnosis, and clinically practical methods often require the detection of multi-class bio-samples. In this work, multi-class classification methods were investigated to simultaneously discriminate among 6 inherited metabolic diseases (IMDs) and normal instances using gas chromatography-mass spectrometry (GC-MS) of urine samples. Two common multi-class classification strategies, one-against-all (OAA) and one-against-one (OAO), were compared and then enhanced using a novel ensemble classification strategy (ECS), which builds a set of sequential sub-classifiers by fusing OAA and OAO and makes the final classification decision using the softmax function. GC-MS data from 240 instances of 6 IMDs and healthy controls were classified by the different strategies based on orthogonal partial least squares discriminant analysis (OPLS-DA), with a particle swarm optimization (PSO) algorithm used for feature selection. With OAA and OAO, the classification accuracies were 70.00% and 82.86%, respectively. With the two ECS-based methods, the total classification accuracies were 91.43% and 94.29%, respectively. The newly proposed ECS provides a useful multi-class classification tool for the simultaneous detection of clinically similar IMDs and may promote practical and reliable diagnosis of IMDs using metabonomics data.
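The abstract names three generic building blocks: OAA decisions, OAO decisions, and a softmax over classifier outputs. The following is a minimal sketch of those building blocks only, not the authors' ECS, whose sub-classifier sequence is not described in the abstract. The function names, the example scores, and the use of majority voting for OAO are all illustrative assumptions; in practice the scores would come from trained binary models such as OPLS-DA.

```python
import math
from itertools import combinations


def softmax(scores):
    """Numerically stable softmax over a list of real-valued scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def oaa_predict(scores):
    """One-against-all: one score per class (class k vs. the rest).

    Returns the index of the class with the highest softmax probability
    and the full probability vector. Ties go to the lowest index.
    """
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


def oao_predict(pair_winners, n_classes):
    """One-against-one: one binary decision per unordered class pair.

    pair_winners maps (i, j) with i < j to the winning class index.
    The final label is chosen by majority vote (an assumption here);
    ties go to the lowest index.
    """
    votes = [0] * n_classes
    for pair in combinations(range(n_classes), 2):
        votes[pair_winners[pair]] += 1
    best = max(range(n_classes), key=votes.__getitem__)
    return best, votes


# With 7 classes (6 IMDs plus normal), OAA needs 7 binary classifiers,
# while OAO needs one per pair of classes.
N_CLASSES = 7
N_OAO_PAIRS = len(list(combinations(range(N_CLASSES), 2)))

# Hypothetical OAA scores for one urine sample (made-up numbers).
example_scores = [0.2, -0.5, 1.7, 0.1, -1.2, 0.4, 0.0]
oaa_label, oaa_probs = oaa_predict(example_scores)
```

OAA trains fewer models but each one faces an imbalanced "one class vs. everything else" problem, whereas OAO trains more models on smaller, more balanced pairwise problems. Combining the two, as the abstract describes, is one way to trade off these properties.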