Combining image, voice, and the patient's questionnaire data to categorize laryngeal disorders.

Affiliation

Department of Electrical & Control Equipment, Kaunas University of Technology, Lithuania.

Publication information

Artif Intell Med. 2010 May;49(1):43-50. doi: 10.1016/j.artmed.2010.02.002. Epub 2010 Mar 24.

Abstract

OBJECTIVE

This paper is concerned with soft computing techniques for categorizing laryngeal disorders based on information extracted from an image of the patient's vocal folds, a voice signal, and questionnaire data.

METHODS

Multiple feature sets are used to characterize the images and voice signals. Eight feature sets characterize the colour, texture, and geometry of biological structures seen in colour images of the vocal folds. Twelve feature sets provide a comprehensive characterization of the voice signal (a sustained phonation of the vowel sound /a/). Answers to 14 questions constitute the questionnaire feature set. A committee of support vector machines (SVMs) is designed to categorize the image, voice, and questionnaire data, represented by the multiple feature sets, into the healthy, nodular, and diffuse classes. Five alternatives for aggregating the separate SVMs into a committee are explored. Feature selection and classifier design are combined into a single learning process based on genetic search.
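To illustrate what aggregating separate classifiers into a committee can look like, here is a minimal Python sketch. The abstract does not name the five aggregation alternatives that were explored, so the two rules shown (majority vote and weighted averaging of class scores) are generic examples chosen as assumptions, not the authors' method. The member scores, the weights, and the function names are all hypothetical.

```python
from collections import Counter

CLASSES = ("healthy", "nodular", "diffuse")


def majority_vote(member_scores):
    """Each member votes for its top-scoring class; the most common vote wins.

    Ties are broken in favour of the class that appears first in CLASSES.
    """
    votes = Counter(
        max(CLASSES, key=lambda c: scores[c]) for scores in member_scores
    )
    return max(CLASSES, key=lambda c: votes[c])


def weighted_average(member_scores, weights):
    """Average the per-class scores across members using non-negative weights."""
    if len(member_scores) != len(weights):
        raise ValueError("need one weight per committee member")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined = {
        c: sum(w * s[c] for w, s in zip(weights, member_scores)) / total
        for c in CLASSES
    }
    return max(CLASSES, key=lambda c: combined[c]), combined


# Hypothetical outputs of three committee members for one patient
# (invented for illustration, not taken from the paper).
member_scores = [
    {"healthy": 0.2, "nodular": 0.5, "diffuse": 0.3},  # image-based member
    {"healthy": 0.1, "nodular": 0.3, "diffuse": 0.6},  # voice-based member
    {"healthy": 0.1, "nodular": 0.7, "diffuse": 0.2},  # questionnaire-based member
]

vote_label = majority_vote(member_scores)
avg_label, avg_scores = weighted_average(member_scores, weights=[1.0, 1.0, 2.0])
print(vote_label, avg_label)
```

Fixed rules like these need no extra training. A learned aggregator, such as an SVM applied to the members' outputs, would presumably replace them with a second classifier fitted on those outputs.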

RESULTS

Data from all three modalities were available for 240 patients. Of those, 151 patients belonged to the nodular class, 64 to the diffuse class, and 25 to the healthy class. When a single feature set was used to characterize each modality, test set classification accuracies of 75.0%, 72.1%, and 85.0% were obtained for the image, voice, and questionnaire data, respectively. Using multiple feature sets increased the accuracy to 89.5% and 87.7% for the image and voice data, respectively. A committee exploiting multiple feature sets from all three modalities achieved a test set classification accuracy of over 98.0%. The highest classification accuracy was achieved with the SVM-based aggregation, with the hyperparameters of the SVM determined by genetic search. Given the difficulty of the task, the classification accuracy obtained is encouraging.
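For context on these accuracies, the class counts above give a simple reference point: the accuracy of always predicting the largest class. The short Python sketch below computes it from the reported counts. It is computed over all 240 patients; the abstract does not describe how the test set was drawn, so treating this figure as a baseline for the test set accuracies is an assumption.

```python
# Class counts as reported in the abstract.
class_counts = {"nodular": 151, "diffuse": 64, "healthy": 25}

total = sum(class_counts.values())
shares = {label: count / total for label, count in class_counts.items()}

# A classifier that always predicts the largest class is right exactly
# as often as that class occurs.
majority_label = max(class_counts, key=class_counts.get)
baseline_accuracy = shares[majority_label]

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
print(f"majority-class baseline ({majority_label}): {baseline_accuracy:.1%}")
```

The nodular class makes up 151 of 240 patients, so the majority-class baseline is about 62.9%.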

CONCLUSIONS

Combining multiple feature sets characterizing a single modality, and combining all three modalities, substantially improved the classification accuracy compared with the highest accuracy obtained from a single feature set and a single modality. Despite the unbalanced data sets used, the error rates obtained for the three classes were rather similar.
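The remark about per-class error rates matters because, with unbalanced classes, overall accuracy can hide poor performance on a small class. The Python sketch below shows how per-class error rates can be read off a confusion matrix. The counts are toy values invented for illustration only and are not results from the paper; the function names are hypothetical.

```python
def per_class_error_rates(confusion):
    """Fraction of each true class's samples that were misclassified.

    confusion[true_label][predicted_label] holds a count.
    """
    rates = {}
    for true_label, row in confusion.items():
        n = sum(row.values())
        if n == 0:
            raise ValueError(f"no samples for class {true_label!r}")
        rates[true_label] = 1.0 - row.get(true_label, 0) / n
    return rates


def overall_accuracy(confusion):
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(row.get(label, 0) for label, row in confusion.items())
    total = sum(sum(row.values()) for row in confusion.values())
    return correct / total


# Toy counts, invented for illustration only; NOT results from the paper.
toy_confusion = {
    "nodular": {"nodular": 58, "diffuse": 2, "healthy": 0},
    "diffuse": {"nodular": 1, "diffuse": 18, "healthy": 1},
    "healthy": {"nodular": 3, "diffuse": 2, "healthy": 5},
}

rates = per_class_error_rates(toy_confusion)
accuracy = overall_accuracy(toy_confusion)
print(f"overall accuracy: {accuracy:.1%}")
for label, rate in rates.items():
    print(f"{label} error rate: {rate:.1%}")
```

In this toy case the overall accuracy is 90%, yet half of the healthy samples are misclassified. Similar error rates across the three classes, as the abstract reports, indicate that the classifier does not simply favour the largest class.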
