Chen Teng, Polak Paweł, Uryasev Stanislav
Department of Applied Math & Statistics, Stony Brook University, Stony Brook, NY, USA.
J Appl Stat. 2022 May 4;50(11-12):2473-2503. doi: 10.1080/02664763.2022.2064975. eCollection 2023.
Early detection and effective treatment of patients with severe COVID-19 remain two major challenges during the current pandemic. Analysis of molecular changes in blood samples from patients with severe disease is one of the promising approaches to this problem. From thousands of proteomic, metabolomic, lipidomic, and transcriptomic biomarkers selected in other research, we identify several pairs that, after an additional nonlinear spline transformation, are highly effective in classifying and predicting severe COVID-19 cases. The performance of these pairs is evaluated in-sample, in a cross-validation exercise, and in an out-of-sample analysis on two independent datasets. We further improve our classifier by identifying complementary pairs using hierarchical clustering. As a result, we achieve 96-98% AUC on the validation data. Our findings can help medical experts identify small groups of biomarkers that, after nonlinear transformation, can be used to construct a cost-effective test for patient screening and prediction of severity progression.