lab2clean: a novel algorithm for automated cleaning of retrospective clinical laboratory results data for secondary uses.

Affiliations

Department of Public Health and Primary Care, KU Leuven, Leuven, Belgium.

Laboratory Medicine Department, Menoufia University National Liver Institute, Shebin El-Kom, Egypt.

Publication information

BMC Med Inform Decis Mak. 2024 Sep 3;24(1):245. doi: 10.1186/s12911-024-02652-7.

DOI:10.1186/s12911-024-02652-7
PMID:39227951
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11370074/
Abstract

BACKGROUND

The integrity of clinical research and machine learning models in healthcare heavily relies on the quality of underlying clinical laboratory data. However, the preprocessing of this data to ensure its reliability and accuracy remains a significant challenge due to variations in data recording and reporting standards.

METHODS

We developed lab2clean, a novel algorithm aimed at automating and standardizing the cleaning of retrospective clinical laboratory results data. lab2clean was implemented as two R functions specifically designed to enhance data conformance and plausibility by standardizing result formats and validating result values. The functionality and performance of the algorithm were evaluated using two extensive electronic medical record (EMR) databases, encompassing various clinical settings.
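
The two steps named above, standardizing result formats (conformance) and validating result values (plausibility), can be illustrated with a minimal Python sketch. This is not lab2clean itself, which is implemented in R; the function names `standardize_result` and `is_plausible`, the regular expression, and the numeric limits are all hypothetical and chosen only for illustration, since the abstract does not describe the actual rules.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: an optional comparator (<, >, <=, >=) followed by a
# number written with either a decimal point or a decimal comma.
_NUMERIC = re.compile(r"^\s*([<>]=?)?\s*([+-]?\d+(?:[.,]\d+)?)\s*$")


def standardize_result(raw: str) -> Optional[Tuple[str, float]]:
    """Conformance step: map a raw result string to (comparator, value).

    Returns None when the string is not a recognizable numeric result.
    """
    match = _NUMERIC.match(raw)
    if match is None:
        return None
    comparator = match.group(1) or "="
    value = float(match.group(2).replace(",", "."))
    return comparator, value


def is_plausible(value: float, low: float, high: float) -> bool:
    """Plausibility step: check a value against assumed reportable limits."""
    return low <= value <= high


if __name__ == "__main__":
    # Assumed limits for an unnamed test; not clinical reference values.
    LOW, HIGH = 0.0, 100.0
    for raw in ["5,6", "< 0.5", "hemolyzed", "980"]:
        parsed = standardize_result(raw)
        if parsed is None:
            print(f"{raw!r}: non-conformant format")
        else:
            comparator, value = parsed
            flag = "plausible" if is_plausible(value, LOW, HIGH) else "implausible"
            print(f"{raw!r}: {comparator} {value} ({flag})")
```

Keeping the two steps separate means a record can be reported as non-conformant (format not understood) or as implausible (format understood, value out of range), which mirrors the conformance and plausibility distinction in the abstract.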

RESULTS

lab2clean effectively reduced the variability of laboratory results and identified potentially erroneous records. Upon deployment, it demonstrated effective and fast standardization and validation of substantial laboratory data records. The evaluation highlighted significant improvements in the conformance and plausibility of lab results, confirming the algorithm's efficacy in handling large-scale data sets.

CONCLUSIONS

lab2clean addresses the challenge of preprocessing and cleaning clinical laboratory data, a critical step in ensuring high-quality data for research outcomes. It offers researchers a straightforward, efficient tool for improving the quality of clinical laboratory data, which make up a major portion of healthcare data, thereby enhancing the reliability and reproducibility of clinical research outcomes and clinical machine learning models. Future developments aim to broaden its functionality and accessibility, solidifying its vital role in healthcare data management.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d633/11370074/f275731fc1d6/12911_2024_2652_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d633/11370074/2f5a3b00b482/12911_2024_2652_Fig2_HTML.jpg
