Zhang Jimmy, Song Luo, Miller Zachary, Chan Kwun C G, Huang Kuan-Lin
Department of Genetics and Genomic Sciences, Center for Transformative Disease Modeling, Tisch Cancer Institute, Icahn Institute for Data Science and Genomic Technology, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA.
Columbia University, New York, NY, 10027, USA.
Commun Med (Lond). 2024 Feb 28;4(1):23. doi: 10.1038/s43856-024-00437-7.
Dementia care is challenging due to the divergent trajectories in disease progression and outcomes. Predictive models are needed to flag patients at risk of near-term mortality and identify factors contributing to mortality risk across different dementia types.
Here, we developed machine-learning models predicting dementia patient mortality at four different survival thresholds using a dataset of 45,275 unique participants and 163,782 visit records from the U.S. National Alzheimer's Coordinating Center (NACC). We built multi-factorial XGBoost models using a small set of mortality predictors and conducted stratified analyses with dementia type-specific models.
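As an illustration of what predicting mortality at several survival thresholds involves, the following minimal sketch derives one binary label per threshold for a single visit record. The record fields (`visit`, `death`, `last_seen`), the rule for excluding visits whose follow-up ends before the threshold, and the 365.25-day year are assumptions made for illustration; they are not taken from the study or from the NACC data dictionary.

```python
from datetime import date
from typing import Dict, Optional

# Survival thresholds named in the abstract, in years.
THRESHOLDS_YEARS = (1, 3, 5, 10)
DAYS_PER_YEAR = 365.25  # assumption: average year length


def mortality_label(visit: date, death: Optional[date],
                    last_seen: date, years: int) -> Optional[int]:
    """Return 1 if death occurred within `years` of the visit, 0 if the
    participant is known to have survived that long, and None if follow-up
    ended too early to tell (hypothetical censoring rule)."""
    horizon_days = years * DAYS_PER_YEAR
    if death is not None:
        return 1 if (death - visit).days <= horizon_days else 0
    # No recorded death: a negative label needs enough observed follow-up.
    if (last_seen - visit).days >= horizon_days:
        return 0
    return None


def labels_for_visit(visit: date, death: Optional[date],
                     last_seen: date) -> Dict[int, Optional[int]]:
    """One label per threshold, so a separate classifier can be trained for each."""
    return {y: mortality_label(visit, death, last_seen, y)
            for y in THRESHOLDS_YEARS}


if __name__ == "__main__":
    print(labels_for_visit(date(2010, 1, 1), date(2012, 6, 1), date(2012, 6, 1)))
```

Each threshold then defines its own binary classification task, and visits labeled `None` would be left out of training and evaluation for that threshold under this assumed rule.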
Our models achieved an area under the receiver operating characteristic curve (AUC-ROC) of over 0.82 at each of the 1-, 3-, 5-, and 10-year thresholds using a parsimonious set of nine features. The trained models relied mainly on dementia-related predictors, such as specific neuropsychological tests, and were minimally affected by other age-related causes of death, e.g., stroke and cardiovascular conditions. Notably, stratified analyses revealed shared and distinct predictors of mortality across eight dementia types. Unsupervised clustering of mortality predictors grouped vascular dementia with depression and Lewy body dementia with frontotemporal lobar dementia.
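For readers less familiar with the reported metric, AUC-ROC can be read as the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen patient who survived, with ties counted as one half. The sketch below computes it directly from that pairwise definition. The function name and the toy labels and scores are hypothetical and are not data from the study.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example: 1 = died within the threshold, 0 = survived.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.1, 0.4, 0.35, 0.8]
    print(auc_roc(toy_labels, toy_scores))  # 0.75
```

The pairwise loop is quadratic in the number of records, so it suits small checks only; on a dataset of the size described here, a rank-based routine from an established library would be the practical choice.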
This study demonstrates the feasibility of flagging dementia patients at risk of mortality for personalized clinical management. Parsimonious machine-learning models can be used to predict dementia patient mortality with a limited set of clinical features, and dementia type-specific models can be applied to heterogeneous dementia patient populations.