A Bayesian network model for neurocognitive disorders digital screening in Chinese population: development and validation study.

Authors

Yu Yifan, Zhang Shuaijie, Li Hongkai, Xue Fuzhong

Affiliations

Shandong Mental Health Center, Jinan, Shandong Province, People's Republic of China.

Department of Epidemiology and Health Statistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Jinan, People's Republic of China.

Publication information

BMC Psychiatry. 2025 Aug 4;25(1):760. doi: 10.1186/s12888-025-07189-1.

Abstract

BACKGROUND

Neurocognitive disorders (NCDs), classified under ICD-10 codes F00-F09, are a category of mental disorders associated with brain disease, injury, or systemic conditions that lead to cerebral dysfunction. NCDs impose a substantial disease burden and pose a growing global public health challenge. Early screening can improve patients' quality of life and reduce healthcare costs. An inexpensive, convenient screening model that can be applied to large populations is therefore urgently needed to make NCD screening more efficient.

METHODS

This study aimed to construct a classification model for NCD screening from cross-sectional electronic health record data in the Cheeloo Whole Lifecycle eHealth Research-based Database (2015-2017). Eligible participants were adults aged 18 years or older with no prior NCD diagnosis at baseline, drawn from multiple cities in Shandong Province, China. Among 1,626,817 individuals initially screened, 4,518 diagnosed NCD cases were included for model building and validation. Participants were assigned to a training set or a validation set according to geographic location. Candidate variables were first screened by univariate logistic regression; gender and the 30 variables with the highest coefficient of determination (R²) for NCD status were retained for model construction. The optimal network structure was then identified with the Tabu search algorithm guided by the Bayesian Information Criterion, and parameters were estimated by maximum likelihood. The Bayesian network model was benchmarked against a multivariable logistic regression model, and performance was assessed with ROC curves, calibration curves, and decision curve analysis. In sensitivity analyses, random missingness was introduced into the dataset to evaluate the robustness of both the Bayesian network model and the multivariable logistic regression model.
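To make the scoring step concrete, the following is a minimal sketch of how a candidate network structure can be scored with the Bayesian Information Criterion when conditional probability tables are fitted by maximum likelihood. It is not the authors' implementation and does not perform the Tabu search itself; it only shows the score a search procedure would compare when deciding whether to add or remove an edge. The variable names (`exposure`, `ncd`) and the toy records are hypothetical.

```python
import math
from collections import Counter

def mle_cpt(rows, child, parents):
    """Maximum-likelihood estimate of P(child | parents) from discrete records."""
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in rows)
    parent_totals = Counter(tuple(r[p] for p in parents) for r in rows)
    return {(pa, x): n / parent_totals[pa] for (pa, x), n in joint.items()}

def local_bic(rows, child, parents):
    """BIC term for one node: log-likelihood minus (free parameters / 2) * log(N)."""
    cpt = mle_cpt(rows, child, parents)
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in rows)
    loglik = sum(n * math.log(cpt[key]) for key, n in joint.items())
    child_levels = len({r[child] for r in rows})
    parent_configs = 1
    for p in parents:
        parent_configs *= len({r[p] for r in rows})
    n_params = (child_levels - 1) * parent_configs
    return loglik - 0.5 * n_params * math.log(len(rows))

def network_bic(rows, structure):
    """Sum of local BIC terms; structure maps each node to its list of parents."""
    return sum(local_bic(rows, node, parents) for node, parents in structure.items())

# Hypothetical toy data: a binary exposure associated with a binary outcome.
rows = (
    [{"exposure": 1, "ncd": 1}] * 40
    + [{"exposure": 1, "ncd": 0}] * 10
    + [{"exposure": 0, "ncd": 1}] * 10
    + [{"exposure": 0, "ncd": 0}] * 40
)

no_edge = {"exposure": [], "ncd": []}              # independent nodes
with_edge = {"exposure": [], "ncd": ["exposure"]}  # exposure -> ncd

print(network_bic(rows, no_edge), network_bic(rows, with_edge))
```

On this toy data the structure with the edge scores higher, because the gain in log-likelihood outweighs the penalty for the extra parameter. A score-based search repeats this kind of comparison over many candidate moves, and a Tabu list additionally forbids recently reversed moves to help escape local optima.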

RESULTS

The final Bayesian network model included 31 variables, eight of which were directly connected to the NCD node in the learned network structure. The model showed good discrimination, with AUCs of 0.849 (95% CI: 0.839-0.859), 0.821 (95% CI: 0.803-0.840), and 0.800 (95% CI: 0.785-0.815) in the training, testing, and validation sets, respectively. The calibration curves indicated good calibration, and decision curve analysis supported the model's clinical applicability. In the sensitivity analysis, the AUC of the Bayesian network model was 0.791 (95% CI: 0.777-0.806), indicating good robustness to missing data.
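For readers less familiar with the two metrics reported here, the sketch below shows one common way to compute them: AUC as the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, and the net benefit used in decision curve analysis at a single risk threshold. The labels and scores are made-up illustrative values, not data from the study, and the confidence intervals reported above are not reproduced.

```python
def auc(labels, scores):
    """AUC via pairwise comparison of cases and non-cases (ties count one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def net_benefit(labels, scores, threshold):
    """Net benefit at a risk threshold pt: TP/N - (FP/N) * pt / (1 - pt)."""
    n = len(labels)
    tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)

# Hypothetical predicted risks for three cases and five non-cases.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1]

print(auc(labels, scores))               # 14 of 15 case/non-case pairs ranked correctly
print(net_benefit(labels, scores, 0.5))  # 2/8 - (1/8) * 1 = 0.125
```

A decision curve plots this net benefit across a range of thresholds and compares it with the "screen everyone" and "screen no one" strategies.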

CONCLUSIONS

The findings indicate that the Bayesian network model can identify factors directly related to NCDs and accurately predict NCD risk in primary healthcare settings. The model is applicable to NCD screening among adults in large-scale electronic health record systems. It incorporates 31 demographic and clinical variables and is robust to missing data, supporting its potential utility in clinical decision-making.
