Machine Learning and Data Mining Methods in Diabetes Research

Authors

Kavakiotis Ioannis, Tsave Olga, Salifoglou Athanasios, Maglaveras Nicos, Vlahavas Ioannis, Chouvarda Ioanna

Affiliations

Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece; Institute of Applied Biosciences, CERTH, Thessaloniki, Greece.

Laboratory of Inorganic Chemistry, Department of Chemical Engineering, Aristotle University of Thessaloniki, Thessaloniki 54124, Greece.

Publication information

Comput Struct Biotechnol J. 2017 Jan 8;15:104-116. doi: 10.1016/j.csbj.2016.12.005. eCollection 2017.

Abstract

Remarkable advances in biotechnology and health sciences have led to the production of large volumes of data, such as high-throughput genetic data and clinical information generated from large Electronic Health Records (EHRs). As a result, the application of machine learning and data mining methods in the biosciences is now more vital than ever in efforts to intelligently transform the available information into valuable knowledge. Diabetes mellitus (DM) is a group of metabolic disorders that exerts significant pressure on human health worldwide. Extensive research into all aspects of diabetes (diagnosis, etiopathophysiology, therapy, etc.) has generated huge amounts of data. The aim of the present study is to conduct a systematic review of the applications of machine learning and data mining techniques and tools in diabetes research with respect to a) Prediction and Diagnosis, b) Diabetic Complications, c) Genetic Background and Environment, and d) Health Care and Management, with the first category appearing to be the most popular. A wide range of machine learning algorithms was employed. In general, 85% of those used were supervised learning approaches and 15% were unsupervised ones, more specifically association rules. Support vector machines (SVM) emerge as the most successful and widely used algorithm. Concerning the type of data, clinical datasets were mainly used. The applications in the selected articles demonstrate the usefulness of extracting valuable knowledge, leading to new hypotheses that target deeper understanding and further investigation of DM.
