Rivas Pablo, Moore Sharon, Iwaniec Urszula T, Turner Russell T, Grant Kathy, Baker Erich
School of Computer Science and Mathematics, Marist College, Poughkeepsie, NY, USA.
Department of Computer Science, Baylor University, Waco, TX, USA.
Proc (Int Conf Comput Sci Comput Intell). 2018 Dec;2018:1357-1361. doi: 10.1109/csci46756.2018.00263.
We explore the effectiveness of Support Vector Machines (SVM) for classification in a sparse data set. Non-human primate models are used to study Alcohol Use Disorders (AUDs); however, the resulting data have a limited sample size. The challenge of small sample sizes and few replicates is explored using a variety of optimization strategies for feature extraction, including correlation, entropy, density, linear support vector machines for regression (SVR), backward SVR, and forward SVR. We investigate these approaches against the backdrop of the relationship between alcohol consumption and tibial bone mineral density. The results indicate that machine learning (ML) can be used effectively with small and diverse biological data sets. The best feature relevance ranking strategies are correlation, forward SVR, and backward SVR.
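Correlation-based relevance ranking, one of the strategies the abstract names, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, feature names, and numeric values below are hypothetical and synthetic, and the forward and backward SVR strategies are not shown. Each feature is scored by the absolute Pearson correlation with the target, and features are sorted from strongest to weakest.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)


def rank_features_by_correlation(columns, target):
    """Return feature names sorted by |r| with the target, strongest first."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Synthetic, made-up values for illustration only (not data from the study).
features = {
    "feature_a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "feature_b": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
    "feature_c": [5.0, 5.0, 5.0, 5.0, 5.0, 5.0],
}
target = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]

ranking = rank_features_by_correlation(features, target)
print(ranking)
```

With very few samples, correlation estimates are noisy, so a ranking like this is best treated as a screening step rather than a final selection.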