Central Team, Health Data Research UK, London, UK
BMJ Health Care Inform. 2021 May;28(1). doi: 10.1136/bmjhci-2020-100303.
The value of healthcare data is being increasingly recognised, as is the need to improve health dataset utility. There is no established mechanism for evaluating healthcare dataset utility, making it difficult to evaluate the effectiveness of activities that improve the data. This paper describes the method for generating, and involving the user community in developing, a proposed framework for evaluating and communicating healthcare dataset utility for given research areas.
An initial version of a matrix to review datasets across a range of dimensions was developed based on previous published findings regarding healthcare data. This was used to initiate a design process through interviews and surveys with data users representing a broad range of user types and use cases, to help develop a focused framework for characterising datasets.
Following 21 interviews, 31 survey responses and testing on 43 datasets, five major categories and 13 subcategories were identified as useful for characterising a dataset, including Data Model, Completeness and Linkage. Each subcategory was graded to facilitate rapid and reproducible evaluation of dataset utility for specific use cases. Testing of applicability to >40 existing datasets demonstrated potential usefulness for subsequent evaluation in real-world practice.
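As an illustration only, the idea of grading each subcategory and comparing the grades against the needs of a specific use case could be sketched as below. The grade scale ("low", "medium", "high"), the class name DatasetProfile, the dataset name and the example requirements are all hypothetical; the abstract does not specify the grading scale or any implementation. Only the labels Data Model, Completeness and Linkage come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical ordinal scale; the abstract does not state the actual grades.
GRADES = ("low", "medium", "high")


@dataclass
class DatasetProfile:
    """Grades for one dataset, keyed by subcategory name."""
    name: str
    grades: Dict[str, str] = field(default_factory=dict)

    def grade(self, subcategory: str, level: str) -> None:
        """Record a grade, rejecting values outside the assumed scale."""
        if level not in GRADES:
            raise ValueError(f"unknown grade: {level!r}")
        self.grades[subcategory] = level

    def meets(self, requirements: Dict[str, str]) -> bool:
        """Return True if every required subcategory is graded at or
        above the minimum level that the use case asks for."""
        for subcategory, minimum in requirements.items():
            actual = self.grades.get(subcategory)
            if actual is None:
                return False  # ungraded subcategories cannot satisfy a requirement
            if GRADES.index(actual) < GRADES.index(minimum):
                return False
        return True


# Hypothetical dataset and grades, for illustration only.
profile = DatasetProfile("example_registry")
profile.grade("Data Model", "high")
profile.grade("Completeness", "medium")
profile.grade("Linkage", "low")

# Hypothetical use cases with different minimum requirements.
linkage_study = {"Linkage": "medium", "Completeness": "medium"}
descriptive_study = {"Completeness": "medium"}

print(profile.meets(linkage_study))      # False: Linkage is graded below "medium"
print(profile.meets(descriptive_study))  # True
```

The same dataset can be adequate for one use case and inadequate for another, which is why the sketch compares grades against per-use-case requirements rather than computing a single overall score.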
The research has developed an evidence-based initial approach for a framework to understand the utility of a healthcare dataset. It is likely to require further refinement following wider application, and additional categories may be required.
The process has resulted in a framework, developed through user-centred design, for objectively evaluating the likely utility of specific healthcare datasets. It should therefore be of value both to potential users of health data and to data custodians seeking to identify the areas where investment in data curation would provide the most value.