Application of machine learning algorithms to predict the thyroid disease risk: an experimental comparative study.

Authors

Islam Saima Sharleen, Haque Md Samiul, Miah M Saef Ullah, Sarwar Talha Bin, Nugraha Ramdhan

Affiliations

Department of Computer Science, Faculty of Science and Technology, American International University - Bangladesh (AIUB), Dhaka, Bangladesh.

Faculty of Computing, College of Computing and Applied Sciences, Universiti Malaysia Pahang, Pekan, Pahang, Malaysia.

Publication

PeerJ Comput Sci. 2022 Mar 3;8:e898. doi: 10.7717/peerj-cs.898. eCollection 2022.

Abstract

Thyroid disease is a general term for medical conditions that prevent the thyroid from producing enough hormones. It can affect anyone: men, women, children, adolescents, and the elderly. Thyroid disorders are detected by blood tests, which are notoriously difficult to interpret because of the large amount of data needed to forecast results. For this reason, this study compares eleven machine learning algorithms to determine which one predicts thyroid risk most accurately. The study uses the Sick-euthyroid dataset from the University of California, Irvine machine learning repository. Because most samples in this dataset belong to a single target class, the accuracy score alone does not reliably indicate prediction quality. The evaluation therefore reports accuracy and recall. In addition, the F1-score yields a single value that balances precision and recall when the class distribution is uneven. The F1-score is used to evaluate the performance of the algorithms, as it is one of the most effective measures for imbalanced classification problems. The experiment shows that the ANN classifier, with an F1-score of 0.957, outperforms the other algorithms compared.
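The abstract's point about accuracy on imbalanced classes can be illustrated with a minimal sketch. The labels below are hypothetical toy data (90 negatives, 10 positives), not the Sick-euthyroid dataset, and the helper names `confusion_counts` and `scores` are assumptions made for this illustration rather than code from the study.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def scores(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for binary labels."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical toy labels: 90 negative samples followed by 10 positive ones.
y_true = [0] * 90 + [1] * 10

# A degenerate model that always predicts the majority (negative) class.
always_negative = [0] * 100

# A model with 4 false alarms that finds 8 of the 10 positives.
better = [0] * 86 + [1] * 4 + [1] * 8 + [0] * 2

print(scores(y_true, always_negative))
print(scores(y_true, better))
```

On this toy data the always-negative model still reaches 0.9 accuracy while its F1-score is 0, which is why a study on an imbalanced dataset would look beyond accuracy when comparing classifiers.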

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0682/9044232/879a628768c2/peerj-cs-08-898-g001.jpg
