Cohen Aaron M, Ambert Kyle, McDonagh Marian
Department of Medical Informatics and Clinical Epidemiology, Oregon Health & Science University, Portland, Oregon, USA.
AMIA Annu Symp Proc. 2010 Nov 13;2010:121-5.
Systematic reviews (SRs) are an important and labor-intensive part of the evidence-based medicine process that could benefit from automated literature classification tools. We conducted a prospective study of a support vector machine-based classifier for supporting the SR literature triage process. Over 50,000 training samples were collected for 18 topics prior to March 2008 and used to make predictions on 11,000 test samples collected during the subsequent two years. Test performance, measured as area under the receiver operating characteristic curve (AUC), was comparable to that estimated by cross-validation on the training set and ranged from 0.75 to 0.99. Mean AUC macro-averaged across all topics was 0.89, demonstrating that these methods can achieve accurate results in near-real-world conditions and are promising tools for deployment to groups conducting SRs.
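To make the reported summary statistic concrete, the following is a minimal sketch of macro-averaging AUC across topics: AUC is computed separately for each topic and the per-topic values are then averaged with equal weight, regardless of topic size. It is not the authors' implementation. The topic names, labels, and scores are invented toy values, and the helper names `auc` and `macro_average_auc` are hypothetical.

```python
from statistics import mean


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive sample
    is scored above a randomly chosen negative one (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def macro_average_auc(per_topic):
    """Compute AUC per topic, then take the unweighted mean over topics."""
    per_topic_auc = {topic: auc(y, s) for topic, (y, s) in per_topic.items()}
    return mean(per_topic_auc.values()), per_topic_auc


# Toy data (assumption): 1 = article included in the review, 0 = excluded;
# scores stand in for classifier outputs on held-out test samples.
toy_predictions = {
    "topic_a": ([1, 0, 1, 0], [0.9, 0.2, 0.7, 0.4]),
    "topic_b": ([1, 0, 0, 1], [0.8, 0.6, 0.3, 0.5]),
}

macro, per_topic_auc = macro_average_auc(toy_predictions)
print(per_topic_auc)
print(macro)
```

Because every topic contributes equally to the mean, a small topic with a low AUC lowers the macro-average as much as a large one would, which is why the per-topic range is worth reporting alongside it.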