Bristol Medical School, University of Bristol, Bristol, BS8 2PS, UK.
UCL Social Research Institute, University College London, London, WC1H 0AL, UK.
F1000Res. 2020 Mar 25;9:210. doi: 10.12688/f1000research.22781.2. eCollection 2020.
Background: Researchers in evidence-based medicine cannot keep up with the volume of both old and newly published primary research articles. Support for the early stages of the systematic review process (searching and screening studies for eligibility) is necessary because it is currently impossible to search for relevant research with precision. Better automated data extraction may not only facilitate the stage of review traditionally labelled 'data extraction', but also change earlier phases of the review process by making it possible to identify relevant research. Exponential improvements in computational processing speed and data storage are fostering the development of data mining models and algorithms. This, in combination with quicker pathways to publication, has led to a large landscape of tools and methods for data mining and extraction.
Objective: To review published methods and tools for data extraction to (semi)automate the systematic reviewing process.
Methods: We propose to conduct a living review. With this methodology we aim to carry out constant evidence surveillance, bi-monthly search updates, and review updates every 6 months if new evidence permits. In a cross-sectional analysis we will extract methodological characteristics and assess the quality of reporting in our included papers.
Conclusions: We aim to increase transparency in the reporting and assessment of automation technologies, to the benefit of data scientists, systematic reviewers and funders of health research. This living review will help to reduce duplicated effort among data scientists who develop data mining methods. It will also inform systematic reviewers about options for supporting their data extraction.