Effect of finite sample size on feature selection and classification: a simulation study.

Affiliation

Department of Radiology, University of Michigan, Ann Arbor, Michigan 48109-5842, USA.

Publication information

Med Phys. 2010 Feb;37(2):907-20. doi: 10.1118/1.3284974.

DOI: 10.1118/1.3284974
PMID: 20229900
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2826389/
Abstract

PURPOSE

The small number of samples available for training and testing is often the limiting factor in finding the most effective features and designing an optimal computer-aided diagnosis (CAD) system. Training on a limited set of samples introduces bias and variance in the performance of a CAD system relative to that trained with an infinite sample size. In this work, the authors conducted a simulation study to evaluate the performances of various combinations of classifiers and feature selection techniques and their dependence on the class distribution, dimensionality, and the training sample size. The understanding of these relationships will facilitate development of effective CAD systems under the constraint of limited available samples.

METHODS

Three feature selection techniques, stepwise feature selection (SFS), sequential floating forward search (SFFS), and principal component analysis (PCA), and two commonly used classifiers, Fisher's linear discriminant analysis (LDA) and support vector machine (SVM), were investigated. Samples were drawn from multivariate Gaussian feature spaces with unequal means and either equal or unequal covariance matrices, and from Gaussian distributions with equal covariance matrices and unequal means estimated from a clinical data set. Classifier performance was quantified by the area under the receiver operating characteristic curve, Az. The mean Az values obtained by the resubstitution and hold-out methods were evaluated for training sample sizes ranging from 15 to 100 per class. The number of simulated features available for selection was set to 50, 100, or 200.
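
The resubstitution and hold-out estimates described above can be illustrated with a minimal sketch. It is not the authors' code: it omits the feature selection step and the SVM, uses only the equal-covariance Gaussian case with an identity covariance matrix, and every name and parameter value (`simulate`, `fit_lda`, `auc`, 10 features of which 3 carry signal, a mean shift of 0.6, 50 trials, 1000 test samples per class) is an assumption chosen for illustration. Az is computed here as the empirical (Mann-Whitney) area under the ROC curve, which may differ from the estimator used in the study.

```python
import numpy as np


def auc(scores_pos, scores_neg):
    """Empirical area under the ROC curve (Mann-Whitney); ties count 1/2."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def fit_lda(x_pos, x_neg, ridge=1e-6):
    """Fisher's linear discriminant: w = S_pooled^{-1} (mean_pos - mean_neg)."""
    n_pos, n_neg = len(x_pos), len(x_neg)
    pooled = (np.cov(x_pos, rowvar=False) * (n_pos - 1)
              + np.cov(x_neg, rowvar=False) * (n_neg - 1)) / (n_pos + n_neg - 2)
    pooled = pooled + ridge * np.eye(pooled.shape[0])  # small ridge keeps the solve stable
    return np.linalg.solve(pooled, x_pos.mean(axis=0) - x_neg.mean(axis=0))


def simulate(n_train, n_features=10, n_test=1000, n_trials=50, seed=0):
    """Return (mean resubstitution Az, mean hold-out Az) over repeated draws."""
    rng = np.random.default_rng(seed)
    shift = np.zeros(n_features)
    shift[:3] = 0.6  # hypothetical: only the first 3 features separate the classes
    resub, hold = [], []
    for _ in range(n_trials):
        # Two classes, identity covariance, means differing by `shift`.
        tr_pos = rng.standard_normal((n_train, n_features)) + shift
        tr_neg = rng.standard_normal((n_train, n_features))
        te_pos = rng.standard_normal((n_test, n_features)) + shift
        te_neg = rng.standard_normal((n_test, n_features))
        w = fit_lda(tr_pos, tr_neg)
        resub.append(auc(tr_pos @ w, tr_neg @ w))  # scored on the training samples
        hold.append(auc(te_pos @ w, te_neg @ w))   # scored on independent samples
    return float(np.mean(resub)), float(np.mean(hold))


if __name__ == "__main__":
    for n in (15, 30, 100):
        r, h = simulate(n)
        print(f"n_train={n:3d} per class  resubstitution Az={r:.3f}  hold-out Az={h:.3f}")
```

Under these assumptions the mean resubstitution Az should sit above the mean hold-out Az, with the gap narrowing as the training sample size per class grows, which is the finite-sample bias the study quantifies.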

RESULTS

The relative performance of the different combinations of classifier and feature selection method was found to depend on the feature space distributions, the dimensionality, and the available training sample sizes. The LDA and SVM with radial kernel performed similarly for most of the conditions evaluated in this study, although the SVM classifier showed a slightly higher hold-out performance than LDA for some conditions and vice versa for other conditions. PCA was comparable to or better than SFS and SFFS for LDA at small sample sizes, but inferior for SVM with polynomial kernel. For the class distributions simulated from clinical data, PCA did not show advantages over the other two feature selection methods. Under this condition, the SVM with radial kernel performed better than the LDA when few training samples were available, while LDA performed better when a large number of training samples were available.

CONCLUSIONS

None of the investigated feature selection-classifier combinations provided consistently superior performance under the studied conditions for different sample sizes and feature space distributions. In general, the SFFS method was comparable to the SFS method while PCA may have an advantage for Gaussian feature spaces with unequal covariance matrices. The performance of the SVM with radial kernel was better than, or comparable to, that of the SVM with polynomial kernel under most conditions studied.
