School of Big Data and Information Industry, Chongqing City Management College, Chongqing 401331, China.
Information Center, Chongqing Medical University, Chongqing 400016, China.
Comput Methods Programs Biomed. 2024 Aug;253:108255. doi: 10.1016/j.cmpb.2024.108255. Epub 2024 May 28.
Stroke has become a major disease threatening health worldwide, characterized by high incidence, high fatality, and a high recurrence rate. Stroke screening based on electronic medical records currently suffers from poor recognition accuracy and insufficient identification of stroke risk levels. These problems arise from systematic errors in medical equipment and from the characteristics of the personnel who collect the records. Misreporting or underreporting by collection personnel and the strong subjectivity of the evaluation indicators introduce further errors.
This paper proposes an isolation forest-voting fusion-multioutput algorithm model. First, the screening data are converted to numerical form and normalized. The composite feature score index proposed in this paper is then used to analyze the importance of risk factors, and the isolation forest algorithm is used to detect abnormal samples. Finally, the voting fusion algorithm proposed in this paper performs decision-fusion classification, and the model outputs multidimensional results (risk factor importance score, abnormal sample label, risk level classification, and stroke prediction) that doctors and medical staff can use as auxiliary decision information.
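As a rough illustration of two of the steps described above, the following Python sketch shows min-max normalization of screening features and decision fusion by plain majority voting. It is only a sketch under stated assumptions: the abstract does not specify the normalization formula or the details of the proposed voting fusion algorithm, so min-max scaling and an unweighted majority vote are stand-ins, and the function names, feature columns, and sample values are all hypothetical.

```python
from collections import Counter


def min_max_normalize(rows):
    """Scale each column of a list of numeric rows to the range [0, 1].

    Assumption: min-max scaling; the abstract does not name the method.
    Constant columns are mapped to 0.0 to avoid division by zero.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]


def majority_vote(predictions):
    """Fuse the labels predicted by several classifiers for one sample.

    Assumption: unweighted majority voting, standing in for the paper's
    own voting fusion algorithm. Ties go to the label that appears first.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical screening records: [age, systolic blood pressure]
records = [[40, 120], [60, 160], [80, 140]]
normalized = min_max_normalize(records)

# Hypothetical labels from three base classifiers for a single sample
votes = ["low risk", "high risk", "high risk"]
fused = majority_vote(votes)
```

In this sketch, `normalized` holds the rescaled feature rows and `fused` holds the single label chosen from the three votes. The anomaly detection and feature scoring steps are not covered here.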
The isolation forest-voting fusion-multioutput algorithm proposed in this paper classifies samples into five categories: zero risk, low risk, high risk, ischemic stroke (TIA), and hemorrhagic stroke (HE). The average accuracy of stroke prediction reached 79.59%.
The isolation forest-voting fusion-multioutput algorithm model proposed in this paper can accurately identify each category of stroke risk level and stroke prediction. It can also output multidimensional auxiliary information to support decisions by medical staff, thereby greatly improving screening efficiency.