Standardizing clinical laboratory data for secondary use.

Author affiliation

Lister Hill National Center for Biomedical Communications, National Library of Medicine, 8600 Rockville Pike, Building 38A/7N707, Bethesda, MD 20894, USA.

Publication details

J Biomed Inform. 2012 Aug;45(4):642-50. doi: 10.1016/j.jbi.2012.04.012. Epub 2012 May 3.

DOI:10.1016/j.jbi.2012.04.012

PMID:22561944

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3419308/

Abstract

Clinical databases provide a rich source of data for answering clinical research questions. However, the variables recorded in clinical data systems are often identified by local, idiosyncratic, and sometimes redundant and/or ambiguous names (or codes) rather than unique, well-organized codes from standard code systems. This reality discourages research use of such databases, because researchers must invest considerable time in cleaning up the data before they can ask their first research question. Researchers at MIT developed MIMIC-II, a nearly complete collection of clinical data about intensive care patients. Because its data are drawn from existing clinical systems, it has many of the problems described above. In collaboration with the MIT researchers, we have begun a process of cleaning up the data and mapping the variable names and codes to LOINC codes. Our first step, which we describe here, was to map all of the laboratory test observations to LOINC codes. We were able to map 87% of the unique laboratory tests, which cover 94% of the total number of laboratory test results. Of the 13% of tests that we could not map, nearly 60% had test names whose real meaning could not be discerned, and 29% represented tests that were not yet included in the LOINC table. These results suggest that LOINC codes cover most of the laboratory tests used in critical care. We have delivered this work to the MIMIC-II researchers, who have included it in their standard MIMIC-II database release so that researchers who use this database in the future will not have to do this work.
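The abstract reports coverage in two ways: the share of unique tests mapped and the share of total test results those tests account for. The sketch below illustrates why the two figures differ. It is not the authors' method or data: the function name `mapping_coverage`, the local test names, and the result counts are all hypothetical, and the LOINC codes are given only as illustrations that should be checked against the LOINC table.

```python
from collections import Counter


def mapping_coverage(result_counts, local_to_loinc):
    """Return (share of unique tests mapped, share of result volume mapped).

    result_counts:  local test name -> number of recorded results
    local_to_loinc: local test name -> LOINC code (absent if unmapped)
    """
    total_tests = len(result_counts)
    total_results = sum(result_counts.values())
    # A test counts as mapped only if it has a non-empty LOINC code.
    mapped = [name for name in result_counts if local_to_loinc.get(name)]
    mapped_results = sum(result_counts[name] for name in mapped)
    return len(mapped) / total_tests, mapped_results / total_results


# Hypothetical local test names and result counts (not from MIMIC-II).
result_counts = Counter({"GLUCOSE": 500, "NA": 300, "K": 150, "MISC_X": 50})

# Illustrative mapping; "MISC_X" stands for a name whose meaning is unclear.
local_to_loinc = {"GLUCOSE": "2345-7", "NA": "2951-2", "K": "2823-3"}

tests_share, results_share = mapping_coverage(result_counts, local_to_loinc)
print(f"unique tests mapped: {tests_share:.0%}")
print(f"test results covered: {results_share:.0%}")
```

In this toy data the single unmapped test is rarely ordered, so coverage by result volume is higher than coverage by unique test, the same pattern as the 87% and 94% figures reported above.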

