Department of Electrical Engineering, National Central University, Taoyuan City, Taiwan.
Department of Medical Humanities and Education, College of Medicine, Kaohsiung Medical University, Kaohsiung, Taiwan.
Eur Psychiatry. 2021 Dec 23;65(1):e1. doi: 10.1192/j.eurpsy.2021.2248.
Support vector machines (SVMs) based on brain-wise functional connectivity (FC) have been widely adopted for single-subject prediction of schizophrenia, but most studies have used small samples. This study aimed to evaluate the performance of SVMs trained on a large single-site dataset and to investigate the effects of demographic homogeneity and training sample size on classification accuracy.
The resting-state functional magnetic resonance imaging (fMRI) dataset comprised 220 patients with schizophrenia and 220 healthy controls. Brain-wise FC was calculated for each participant, and linear SVMs were developed to classify patients and controls automatically. First, we evaluated SVMs trained on all participants and on homogeneous subsamples of men, women, younger (18-30 years), and older (31-50 years) participants using 10-fold nested cross-validation. Then, we held out a fixed test set of 40 participants (20 patients and 20 controls) and evaluated SVMs trained on incrementally larger samples (N = 40, 80, …, 400).
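The nested cross-validation step described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' pipeline: the abstract does not state the brain parcellation, how FC was vectorized, the number of inner folds, or the hyperparameter grid, so the upper-triangle correlation features, the 5-fold inner loop, the `C` grid, and the helper names `fc_features` and `nested_cv_accuracy` are all assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def fc_features(timeseries):
    """Vectorize the upper triangle of a region-by-region correlation matrix.

    timeseries: array of shape (timepoints, regions). Assumed feature
    definition; the source does not specify how FC was vectorized.
    """
    corr = np.corrcoef(timeseries.T)
    rows, cols = np.triu_indices_from(corr, k=1)
    return corr[rows, cols]


def nested_cv_accuracy(X, y, n_outer=10, n_inner=5, seed=0):
    """Return per-fold accuracy of a linear SVM under nested cross-validation.

    The inner loop tunes C on the training part of each outer fold only,
    so the outer test folds never influence model selection.
    """
    inner = StratifiedKFold(n_splits=n_inner, shuffle=True, random_state=seed)
    outer = StratifiedKFold(n_splits=n_outer, shuffle=True, random_state=seed + 1)
    model = GridSearchCV(
        make_pipeline(StandardScaler(), SVC(kernel="linear")),
        param_grid={"svc__C": [0.01, 0.1, 1.0]},  # hypothetical grid
        cv=inner,
        scoring="accuracy",
    )
    return cross_val_score(model, X, y, cv=outer, scoring="accuracy")


if __name__ == "__main__":
    # Synthetic stand-in data: 40 "subjects", 10 regions, 100 timepoints each.
    rng = np.random.default_rng(0)
    y = np.repeat([0, 1], 20)
    subjects = []
    for label in y:
        ts = rng.standard_normal((100, 10))
        if label == 1:
            # Inject a shared signal into three regions to alter their coupling.
            ts[:, :3] += rng.standard_normal((100, 1))
        subjects.append(fc_features(ts))
    X = np.stack(subjects)
    print("mean accuracy:", nested_cv_accuracy(X, y).mean())
```

Placing the scaler inside the pipeline keeps feature standardization within each training fold, which avoids leaking test-fold statistics into the model.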
The SVM based on all participants achieved an accuracy of 85.05%. The SVMs based on male, female, younger, and older participants yielded accuracies of 84.66%, 81.56%, 80.50%, and 86.13%, respectively. Although the SVM based on the older subsample outperformed the one based on all participants, it generalized poorly to younger participants (77.24%). With incremental training sizes, classification accuracy increased stepwise from 72.6% to 83.3%, with >80% accuracy achieved at training sample sizes >240.
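The training-size analysis (a fixed, balanced held-out test set with incrementally larger training sets) can be illustrated along these lines. This is a sketch under assumptions, not the study's code: the abstract does not say how training subsets were drawn or which `C` was used, so the nested, class-balanced subsets, the fixed `C=1.0`, and the helper name `accuracy_by_training_size` are hypothetical choices.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def accuracy_by_training_size(X, y, test_size, train_sizes, seed=0):
    """Score linear SVMs trained on growing subsets against one fixed test set.

    y must hold binary labels 0/1. Each training subset is class-balanced and
    contains the previous, smaller subset (an assumption of this sketch).
    """
    X_pool, X_test, y_pool, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )
    rng = np.random.default_rng(seed)
    idx0 = rng.permutation(np.flatnonzero(y_pool == 0))
    idx1 = rng.permutation(np.flatnonzero(y_pool == 1))

    results = {}
    for n in train_sizes:
        half = n // 2
        if half > min(len(idx0), len(idx1)):
            raise ValueError(f"training size {n} exceeds the balanced pool")
        idx = np.concatenate([idx0[:half], idx1[:half]])
        clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
        clf.fit(X_pool[idx], y_pool[idx])
        results[n] = clf.score(X_test, y_test)  # accuracy on the fixed test set
    return results


if __name__ == "__main__":
    # Synthetic stand-in with the study's group sizes (220 per class).
    rng = np.random.default_rng(1)
    y = np.repeat([0, 1], 220)
    X = rng.standard_normal((440, 50))
    X[y == 1, :10] += 0.5  # weak group difference in ten features
    sizes = list(range(40, 401, 40))
    for n, acc in accuracy_by_training_size(X, y, 40, sizes).items():
        print(n, round(acc, 3))
```

Because the test set never changes, differences between rows reflect the training size rather than which participants happened to be tested.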
The findings indicate that SVMs based on a large dataset yield high classification accuracy, and that models built from large samples with heterogeneous properties are recommended for single-subject prediction of schizophrenia.