A comprehensive case study of deep learning on the detection of alpha thalassemia and beta thalassemia using public and private datasets.

Author information

Nasir Muhammad Umar, Naseem Muhammad Tahir, Ghazal Taher M, Zubair Muhammad, Ali Oualid, Abbas Sagheer, Ahmad Munir, Adnan Khan Muhammad

Affiliations

School of Computing, IVY CMS, Lahore, 54000, Pakistan.

Department of Computer Science, Faculty of Computing, Riphah International University, Islamabad, 45000, Pakistan.

Publication information

Sci Rep. 2025 Apr 17;15(1):13359. doi: 10.1038/s41598-025-97353-0.

DOI:10.1038/s41598-025-97353-0

PMID:40246871

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12006322/

Abstract

This study explores the performance of deep learning models, specifically Convolutional Neural Networks (CNN) and XGBoost, in predicting alpha and beta thalassemia using both public and private datasets. Thalassemia is a genetic disorder that impairs hemoglobin production, leading to anemia and other health complications. Early diagnosis is essential for effective management and prevention of severe health issues. The study applied CNN and XGBoost to two case studies: one for alpha-thalassemia and the other for beta-thalassemia. Public datasets were sourced from medical databases, while private datasets were collected from clinical records, offering a more comprehensive feature set and larger sample sizes. After data preprocessing and splitting, model performance was evaluated. XGBoost achieved 99.34% accuracy on the private dataset for alpha thalassemia, while CNN reached 98.10% accuracy on the private dataset for beta-thalassemia. The superior performance on private datasets was attributed to better data quality and volume. This study highlights the effectiveness of deep learning in medical diagnostics, demonstrating that high-quality data can significantly enhance the predictive capabilities of AI models. By integrating CNN and XGBoost, this approach offers a robust method for detecting thalassemia, potentially improving early diagnosis and reducing disease-related mortality.
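The abstract reports accuracy measured after preprocessing and splitting the data, but it does not give the split ratio or the evaluation code. The following is a minimal sketch of that evaluation step only, assuming a hypothetical 80/20 stratified split and synthetic binary labels. The function names `stratified_split` and `accuracy` are illustrative and are not taken from the paper, and a majority-class placeholder stands in for the CNN and XGBoost models.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), keeping each class's share similar in both parts."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Hold out at least one sample per class.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Synthetic labels for illustration only (0 = non-carrier, 1 = carrier);
# these are NOT data from the study.
labels = [0] * 80 + [1] * 20
train_idx, test_idx = stratified_split(labels, test_fraction=0.2, seed=42)

# Placeholder "model": always predict the majority class seen in training.
train_labels = [labels[i] for i in train_idx]
majority = max(set(train_labels), key=train_labels.count)

y_test = [labels[i] for i in test_idx]
y_pred = [majority] * len(y_test)
baseline_accuracy = accuracy(y_test, y_pred)
```

On these synthetic labels the majority-class placeholder already scores 0.8 on the held-out part, so a reported accuracy is easier to interpret when the class balance of the test set is known.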

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d35a/12006322/70d9292bc31c/41598_2025_97353_Fig1_HTML.jpg
