Van Calster B, Timmerman D, Lu C, Suykens JAK, Valentin L, Van Holsbeke C, Amant F, Vergote I, Van Huffel S
Department of Electrical Engineering (ESAT-SCD), Katholieke Universiteit Leuven, and Department of Obstetrics and Gynecology, University Hospitals K. U. Leuven, Belgium.
Ultrasound Obstet Gynecol. 2007 May;29(5):496-504. doi: 10.1002/uog.3996.
To develop flexible classifiers that predict malignancy in adnexal masses using a large database from nine centers.
The database comprised 1066 patients with at least one persistent adnexal mass, for which extensive clinical and ultrasound data were recorded. The outcome of interest was the histological classification of the adnexal mass as benign or malignant. The outcome was predicted using Bayesian least squares support vector machines, which were compared with relevance vector machines. The models were developed on a training set (n = 754) and evaluated on a test set (n = 312).
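As a rough illustration of the model family named above, the following Python sketch fits a plain (non-Bayesian) least squares support vector machine with a linear kernel by solving the standard LS-SVM linear system. The synthetic data, the fixed regularization constant `gamma` and the function names are assumptions made for illustration; the study's Bayesian inference of hyperparameters, its variable selection and its patient data are not reproduced here.

```python
import numpy as np


def fit_lssvm_linear(X, y, gamma=1.0):
    """Fit an LS-SVM classifier with a linear kernel.

    X: (n, d) feature matrix; y: (n,) labels in {-1, +1}.
    gamma is a hypothetical fixed regularization constant (an assumption);
    a Bayesian LS-SVM would infer it from the data instead.
    """
    n = X.shape[0]
    K = X @ X.T                      # linear kernel matrix
    omega = np.outer(y, y) * K       # Omega_ij = y_i * y_j * K(x_i, x_j)

    # KKT system: [[0, y^T], [y, Omega + I/gamma]] [b; alpha] = [0; 1]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], np.ones(n)))

    solution = np.linalg.solve(A, rhs)
    b, alpha = solution[0], solution[1:]

    # With a linear kernel the primal weights can be recovered explicitly.
    w = X.T @ (alpha * y)
    return w, b


def decision_function(X, w, b):
    """Signed score; positive values are assigned to the +1 class."""
    return X @ w + b


# Synthetic two-class data (illustrative only, not the study's variables).
rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal(2.0, 1.0, (40, 2)),
                    rng.normal(-2.0, 1.0, (40, 2))])
y_demo = np.concatenate([np.ones(40), -np.ones(40)])

w_demo, b_demo = fit_lssvm_linear(X_demo, y_demo, gamma=1.0)
accuracy = np.mean(np.sign(decision_function(X_demo, w_demo, b_demo)) == y_demo)
print(f"training accuracy on synthetic data: {accuracy:.2f}")
```

Because the kernel is linear, the fitted model reduces to a weight vector and an intercept, so each variable's contribution to the score can be read off directly.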
Twenty-five percent of the patients (n = 266) had a malignant tumor. Variable selection resulted in a set of 12 variables for the models: age, maximal diameter of the ovary, maximal diameter of the solid component, personal history of ovarian cancer, hormonal therapy, very strong intratumoral blood flow (i.e. color score 4), ascites, presumed ovarian origin of tumor, multilocular-solid tumor, blood flow within papillary projections, irregular internal cyst wall and acoustic shadows. On the test set, every model achieved an area under the receiver operating characteristic curve (AUC) above 0.940, a sensitivity above 90% and a specificity above 80%. The least squares support vector machine model with a linear kernel achieved an AUC of 0.946, with 91% sensitivity and 84% specificity. The models performed well in the test sets of all the centers.
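For readers less familiar with the reported metrics, the sketch below shows how sensitivity, specificity and AUC can be computed from predicted scores, and checks the arithmetic behind the stated sample sizes and prevalence. The toy labels, scores and the 0.5 cutoff are invented for illustration and are not the study's data; only the counts 1066, 754, 312 and 266 come from the abstract.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = malignant."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a malignant case outscores a benign one.

    Ties count as one half (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical labels and scores).
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.4, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 is an assumed cutoff

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}, AUC={auc(y_true, scores):.3f}")

# Arithmetic check on the counts reported in the abstract.
n_total, n_train, n_test, n_malignant = 1066, 754, 312, 266
print(n_train + n_test == n_total)
print(f"prevalence of malignancy: {n_malignant / n_total:.1%}")
```

Sensitivity and specificity depend on the chosen cutoff, whereas AUC summarizes discrimination across all cutoffs, which is why the abstract reports both.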
Bayesian kernel-based methods can accurately separate malignant from benign masses. The robustness of the models will be investigated in future studies.