lab2clean: a novel algorithm for automated cleaning of retrospective clinical laboratory results data for secondary uses.

Affiliations

Department of Public Health and Primary Care, KU Leuven, Leuven, Belgium.

Laboratory Medicine Department, Menoufia University National Liver Institute, Shebin El-Kom, Egypt.

Publication information

BMC Med Inform Decis Mak. 2024 Sep 3;24(1):245. doi: 10.1186/s12911-024-02652-7.

Abstract

BACKGROUND

The integrity of clinical research and machine learning models in healthcare heavily relies on the quality of underlying clinical laboratory data. However, the preprocessing of this data to ensure its reliability and accuracy remains a significant challenge due to variations in data recording and reporting standards.

METHODS

We developed lab2clean, a novel algorithm aimed at automating and standardizing the cleaning of retrospective clinical laboratory results data. lab2clean was implemented as two R functions specifically designed to enhance data conformance and plausibility by standardizing result formats and validating result values. The functionality and performance of the algorithm were evaluated using two extensive electronic medical record (EMR) databases, encompassing various clinical settings.
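
To make the two steps described above concrete (standardizing the result format, then validating the result value), here is a minimal Python sketch. It is not lab2clean's R code or its API: the function names `standardize_result` and `validate_result`, the regular expression, and the potassium limits are all assumptions made for this illustration only.

```python
import re
from typing import Optional, Tuple

# Hypothetical plausibility limits, for illustration only (not taken from lab2clean).
PLAUSIBLE_RANGE = {"potassium_mmol_L": (1.0, 10.0)}

# Optional comparator, then a number that uses either "." or "," as decimal mark.
_NUMERIC = re.compile(r"^\s*(<=|>=|<|>)?\s*([+-]?\d+(?:[.,]\d+)?)\s*$")


def standardize_result(raw: str) -> Tuple[Optional[str], Optional[float]]:
    """Parse a raw result string into (comparator, value).

    Returns (None, None) when the string is not a recognizable numeric result.
    """
    match = _NUMERIC.match(raw)
    if match is None:
        return None, None
    comparator = match.group(1) or "="
    value = float(match.group(2).replace(",", "."))
    return comparator, value


def validate_result(test: str, value: Optional[float]) -> str:
    """Flag a standardized value against the assumed limits for that test."""
    if value is None:
        return "not_numeric"
    limits = PLAUSIBLE_RANGE.get(test)
    if limits is None:
        return "no_rule"
    low, high = limits
    return "plausible" if low <= value <= high else "implausible"
```

In this sketch a record is flagged rather than dropped, so the decision about what to do with an implausible or non-numeric result stays with the analyst.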

RESULTS

lab2clean effectively reduced the variability of laboratory results and identified potentially erroneous records. Upon deployment, it demonstrated effective and fast standardization and validation of substantial laboratory data records. The evaluation highlighted significant improvements in the conformance and plausibility of lab results, confirming the algorithm's efficacy in handling large-scale data sets.

CONCLUSIONS

lab2clean addresses the challenge of preprocessing and cleaning clinical laboratory data, a critical step in ensuring high-quality data for research. It offers researchers a straightforward, efficient tool for improving the quality of clinical laboratory data, which make up a major portion of healthcare data, thereby enhancing the reliability and reproducibility of clinical research outcomes and clinical machine learning models. Future developments aim to broaden its functionality and accessibility, solidifying its vital role in healthcare data management.
