Department of Critical Care Medicine, Jinhua Municipal Central Hospital, Jinhua Hospital of Zhejiang University, Jinhua 321000, China.
Ann Transl Med. 2016 Jun;4(11):218. doi: 10.21037/atm.2016.03.37.
Machine learning techniques have been widely used in many scientific fields, but their use in the medical literature is limited, partly because of technical difficulties. k-nearest neighbors (kNN) is a simple machine learning method. This article introduces the basic ideas underlying the kNN algorithm and then focuses on how to perform kNN modeling with R. The dataset should be prepared before running the knn() function in R. After the outcome has been predicted with the kNN algorithm, the diagnostic performance of the model should be checked. Average accuracy is the most widely used statistic to reflect the performance of the kNN algorithm. Factors such as the k value, the distance calculation, and the choice of appropriate predictors all have a significant impact on model performance.
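The workflow summarized above (prepare the data, run knn(), then check performance) might look roughly like the following sketch in R. It is an illustration, not the article's own code: the built-in iris dataset, the min-max normalization helper, the 70/30 split, the seed, and k = 5 are all assumptions chosen for the example. It reports overall accuracy from a confusion table, which is one common way to summarize performance.

```r
# Illustrative sketch: dataset, split, seed and k are assumptions, not the article's choices.
library(class)  # provides knn()

set.seed(42)

# Min-max normalization so that no single predictor dominates the distance.
normalize <- function(x) (x - min(x)) / (max(x) - min(x))

predictors <- as.data.frame(lapply(iris[, 1:4], normalize))
labels <- iris$Species

# Random 70/30 split into training and test sets.
train_idx <- sample(nrow(iris), size = round(0.7 * nrow(iris)))
train_x <- predictors[train_idx, ]
test_x  <- predictors[-train_idx, ]
train_y <- labels[train_idx]
test_y  <- labels[-train_idx]

# Classify each test case by majority vote among its k nearest training cases
# (knn() from the class package uses Euclidean distance).
pred <- knn(train = train_x, test = test_x, cl = train_y, k = 5)

# Confusion table and overall accuracy on the held-out cases.
confusion <- table(predicted = pred, actual = test_y)
accuracy <- sum(diag(confusion)) / sum(confusion)
print(confusion)
print(accuracy)
```

Rerunning the knn() call with different values of k, or with a different set of predictor columns, is a simple way to see how these choices affect the accuracy.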