Bekhuis Tanja, Demner-Fushman Dina
Center for Dental Informatics, School of Dental Medicine, Department of Biomedical Informatics, School of Medicine, University of Pittsburgh, PA, USA.
Stud Health Technol Inform. 2010;160(Pt 1):146-50.
Systematic review authors synthesize research to guide clinicians in their practice of evidence-based medicine. Teammates independently identify provisionally eligible studies by reading the same set of hundreds and sometimes thousands of citations during an initial screening phase. We investigated whether supervised machine learning methods can reduce their workload. We also extended earlier research by including observational studies of a rare condition. To build training and test sets, we used annotated citations from a search conducted for an in-progress Cochrane systematic review. We extracted features from titles, abstracts, and metadata, then trained, optimized, and tested several classifiers with respect to mean performance based on 10-fold cross-validations. In the training condition, the evolutionary support vector machine (EvoSVM) with an Epanechnikov or radial kernel was the best classifier: mean recall=100%; mean precision=48% and 41%, respectively. In the test condition, EvoSVM performance degraded: mean recall=77%; mean precision ranged from 26% to 37%. Because near-perfect recall is essential in this context, we conclude that supervised machine learning methods may be useful for reducing workload under certain conditions.
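The evaluation described in the abstract (mean recall and mean precision over 10-fold cross-validation) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `precision_recall`, `k_fold_indices`, and `cross_validate` are hypothetical, the fold assignment is a simple interleaved split, and the "classifier" in the usage lines is a placeholder that marks every citation as eligible, standing in for a trained model such as an SVM.

```python
from statistics import mean


def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 = eligible citation."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Convention assumed here: an undefined ratio (zero denominator) is scored 0.0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) pairs that partition range(n) into k folds."""
    folds = [list(range(i, n, k)) for i in range(k)]  # interleaved assignment
    for i in range(k):
        train_idx = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train_idx, folds[i]


def cross_validate(items, labels, train_fn, k=10):
    """Return (mean precision, mean recall) over k folds.

    train_fn(train_items, train_labels) must return a callable predict(item) -> 0 or 1.
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(items), k):
        predict = train_fn([items[j] for j in train_idx],
                           [labels[j] for j in train_idx])
        y_true = [labels[j] for j in test_idx]
        y_pred = [predict(items[j]) for j in test_idx]
        scores.append(precision_recall(y_true, y_pred))
    return mean(p for p, _ in scores), mean(r for _, r in scores)


# Toy usage: 20 placeholder citations, half of them eligible.
citations = [f"citation {i}" for i in range(20)]
eligible = [1] * 10 + [0] * 10
include_all = lambda train_items, train_labels: (lambda item: 1)
mean_precision, mean_recall = cross_validate(citations, eligible, include_all)
```

A screener that includes everything reaches perfect recall but saves no reading, which is why the abstract reports precision alongside recall: with recall held near 100%, higher precision means fewer ineligible citations left for reviewers to read.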