The external validity of machine learning-based prediction scores from hematological parameters of COVID-19: A study using hospital records from Brazil, Italy, and Western Europe.

Author information

Safdari Ali, Keshav Chanda Sai, Mody Deepanshu, Verma Kshitij, Kaushal Utsav, Burra Vaadeendra Kumar, Ray Sibnath, Bandyopadhyay Debashree

Affiliations

Department of Biological Sciences, Birla Institute of Technology and Science, Pilani, Hyderabad Campus, Hyderabad, Telangana, India.

Gencrest Private Limited, 301-302, B-Wing, Corporate Center, Mumbai, India.

Publication information

PLoS One. 2025 Feb 4;20(2):e0316467. doi: 10.1371/journal.pone.0316467. eCollection 2025.

Abstract

The unprecedented worldwide pandemic caused by COVID-19 has motivated several research groups to develop machine learning-based approaches that aim to automate the diagnosis or screening of COVID-19 at large scale. The gold standard for COVID-19 detection, quantitative real-time polymerase chain reaction (qRT-PCR), is expensive and time-consuming. Haematology-based detection is a fast and near-accurate alternative, although it has been less explored. The external validity of haematology-based COVID-19 predictions on diverse populations is yet to be fully investigated. Here we report the external validity of machine learning-based prediction scores from haematological parameters recorded in different hospitals of Brazil, Italy, and Western Europe (raw sample size, 195554). Of the seven ML classifiers compared, XGBoost performed consistently better on all the datasets. The working models use a set of either four or fourteen haematological parameters. The internal performance of the XGBoost models (AUC scores ranging from 84% to 97%) was superior to that of ML models reported in the literature for some of these datasets (AUC scores ranging from 84% to 87%). The meta-validation of the external performance showed reliable performance (AUC score 86%) along with good accuracy of the probabilistic prediction (Brier score 14%), particularly when the model was trained and tested on fourteen haematological parameters from the same country (Brazil). The external performance was reduced when the model was trained on datasets from Italy and tested on Brazil (AUC score 69%) and Western Europe (AUC score 65%), presumably affected by factors such as ethnicity, phenotype, immunity, and reference ranges, which vary across populations. The contribution of the present study is a COVID-19 prediction tool that is reliable and parsimonious, using fewer haematological features than the earlier study with meta-validation, and based on a sufficient sample size (n = 195554). The current models can thus be applied at other demographic locations, preferably after prior training of the model on the same population. Availability: https://covipred.bits-hyderabad.ac.in/home; https://github.com/debashreebanerjee/CoviPred.
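
The two metrics quoted in the abstract, AUC and Brier score, can be illustrated with a minimal sketch. This is not the authors' CoviPred code: the function names and the toy labels and probabilities below are hypothetical, chosen only to show how each score is computed from a model's predicted probabilities on a held-out (for example, external) cohort.

```python
# Illustrative only: hypothetical helper names and toy data, not taken from the study.

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome.

    0 is a perfect score; always predicting 0.5 gives 0.25. Lower is better.
    """
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    This pairwise form is O(n_pos * n_neg), which is fine for a small example.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy external-test cohort: 1 = PCR-positive, 0 = PCR-negative (made-up values).
y_true = [1, 0, 1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1]

if __name__ == "__main__":
    print(f"AUC:   {roc_auc(y_true, y_prob):.3f}")
    print(f"Brier: {brier_score(y_true, y_prob):.3f}")
```

In an external-validation setting, the probabilities would come from a model fitted on one cohort and applied unchanged to another. AUC measures how well the model ranks positives above negatives, while the Brier score also penalizes poorly calibrated probabilities, which is why the abstract reports both.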
