Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong Province, China.
Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong Province, China; School of Control Science and Engineering, Shandong University, Jinan, Shandong Province, China.
Comput Methods Programs Biomed. 2022 Jun;221:106842. doi: 10.1016/j.cmpb.2022.106842. Epub 2022 Apr 28.
The identification of carotid plaque, one of the most crucial tasks in stroke screening, is of great significance for assessing subclinical atherosclerosis and preventing the onset of stroke. However, traditional ultrasound examination is neither widely available nor cost-effective for asymptomatic people, particularly low-income individuals in rural areas. It is therefore necessary to develop an accurate and explainable model for early identification of the risk of plaque prevalence, which can support the primary prevention of stroke.
We developed an ensemble learning method to predict the occurrence of carotid plaques. Model performance was evaluated using ten-fold cross-validation on a dataset of 1440 subjects (50% with plaques and 50% without). Four machine learning methods (extreme gradient boosting (XGBoost), gradient boosting decision tree, random forest, and support vector machine) were compared. The interpretability of the XGBoost model, which performed best, was then analyzed from three aspects: feature importance, the effect of features on the prediction model, and the effect of features on the prediction decision for a specific subject.
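The evaluation protocol described above can be sketched in a few lines. The following is a minimal illustration of a ten-fold split on a balanced cohort of 1440 subjects, not the authors' code. Stratifying by class and the fixed random seed are assumptions (the abstract only states that ten-fold cross-validation was used), and the function name `stratified_k_fold` is hypothetical.

```python
import random


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs with class balance kept in each fold.

    Stratification and the seed are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]

    # Group subject indices by class label.
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    # Shuffle within each class, then deal indices round-robin into folds.
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    # Each fold serves once as the test set; the rest form the training set.
    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            j for f, fold in enumerate(folds) if f != i for j in fold
        )
        yield train_idx, test_idx


# Balanced cohort as described: 720 subjects with plaque (1), 720 without (0).
labels = [1] * 720 + [0] * 720
splits = list(stratified_k_fold(labels, k=10))
```

In a full comparison, each candidate classifier would be fitted on `train_idx` and scored on `test_idx` for every split, with the per-fold scores then averaged.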
The XGBoost algorithm provided the best performance in carotid plaque prediction (sensitivity: 0.8678, specificity: 0.8592, accuracy: 0.8632, F1 score: 0.8621, area under the curve: 0.8635) and also showed excellent performance with missing data. Interpretability analysis further showed that the decisions of the XGBoost model were highly congruent with clinical knowledge.
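For reference, the threshold-based metrics reported above follow from the counts of a binary confusion matrix. The sketch below uses the standard definitions; the counts in the example are hypothetical and are not taken from the study, and the function name `classification_metrics` is illustrative only.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts."""
    sensitivity = tp / (tp + fn)            # true positive rate (recall)
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not the study's results).
metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
```

The area under the curve is not covered here because it depends on the ranking of predicted scores rather than on a single confusion matrix.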
The model outperformed state-of-the-art methods. It is therefore a promising carotid plaque prediction tool that could be used in the primary prevention of stroke.