International UNESCO Center for Health Related Basic Sciences and Human Nutrition, Mashhad University of Medical Sciences, Mashhad, Iran.
Student Research Committee, Mashhad University of Medical Sciences, Mashhad, Iran.
Crit Rev Clin Lab Sci. 2021 Jun;58(4):275-296. doi: 10.1080/10408363.2020.1857681. Epub 2021 Mar 19.
Data mining uses mathematics, statistics, artificial intelligence, and machine learning to determine the relationships between variables in large data sets. Data mining has previously been shown to improve the prediction and diagnostic precision of type 2 diabetes mellitus. A few studies have applied machine learning to assess biomarkers related to hypertension and metabolic syndrome, and to refine the assessment of cardiovascular disease risk. Machine learning methods have also been applied to assess new biomarkers and survival outcomes in patients with renal diseases, in order to predict the development of chronic kidney disease, disease progression, and renal graft survival. In these renal studies, random forest methods were found to be the best for predicting chronic kidney disease. Some studies have used decision tree models to investigate the prognosis of nonalcoholic fatty liver disease and acute liver failure, and to predict therapy response in patients with viral disorders. Machine learning techniques, such as the Sparse High-Order Interaction Model with Rejection Option, have been used to diagnose Alzheimer's disease. Data mining techniques have also been applied to identify the risk factors for serious mental illness, such as depression and dementia, and to help diagnose such patients and predict their quality of life. In relation to child health, some studies have determined the best algorithms for predicting obesity and malnutrition. Machine learning has also been used to identify important risk factors for preterm birth and low birth weight. Published studies of patients with cancer and bacterial diseases are limited, and these areas should perhaps be addressed more comprehensively in future studies.
Herein, we provide an in-depth review of studies in which biochemical biomarker data were analyzed using machine learning methods to assess the risk of several common diseases, in order to summarize the potential applications of data mining methods in clinical diagnosis. Data mining techniques are now increasingly applied to clinical diagnostics, and they have the potential to support this field.