Cleaning of anthropometric data from PCORnet electronic health records using automated algorithms.

Authors

Lin Pi-I D, Rifas-Shiman Sheryl L, Aris Izzuddin M, Daley Matthew F, Janicke David M, Heerman William J, Chudnov Daniel L, Freedman David S, Block Jason P

Affiliations

Division of Chronic Disease Research Across the Lifecourse (CoRAL), Department of Population Medicine, Harvard Medical School and Harvard Pilgrim Health Care Institute, Boston, Massachusetts, USA.

Institute for Health Research, Kaiser Permanente Colorado, Aurora, Colorado, USA.

Publication information

JAMIA Open. 2022 Nov 2;5(4):ooac089. doi: 10.1093/jamiaopen/ooac089. eCollection 2022 Dec.

DOI:10.1093/jamiaopen/ooac089
PMID:36339053
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9629892/
Abstract

OBJECTIVE

To demonstrate the utility of growthcleanr, an anthropometric data cleaning method designed for electronic health records (EHR).

MATERIALS AND METHODS

We used all available pediatric and adult height and weight data from an ongoing observational study that includes EHR data from 15 healthcare systems. We applied growthcleanr to identify outliers and errors, and compared its performance in pediatric data with 2 other pediatric data cleaning methods: (1) conditional percentile (cp) and (2) the PaEdiatric ANthropometric measurement Outlier Flagging pipeline (pano).

RESULTS

687 226 children (<20 years) and 3 267 293 adults contributed 71 246 369 weight and 51 525 487 height measurements. growthcleanr flagged 18% of pediatric and 12% of adult measurements for exclusion, mostly as carried-forward measures for pediatric data and duplicates for adult and pediatric data. After removing the flagged measurements, 0.5% and 0.6% of the pediatric heights and weights and 0.3% and 1.4% of the adult heights and weights, respectively, were biologically implausible according to the CDC and other established cut points. Compared with other pediatric cleaning methods, growthcleanr flagged the most measurements for exclusion; however, it did not flag some more extreme measurements. The prevalence of severe pediatric obesity was 9.0%, 9.2%, and 8.0% after cleaning by growthcleanr, cp, and pano, respectively.

CONCLUSION

growthcleanr is useful for cleaning pediatric and adult height and weight data. It is the only method with the ability to clean adult data and identify carried-forward and duplicate measurements, which are prevalent in EHR data. Findings of this study can be used to improve the algorithm.
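
The two most common flag types reported above, duplicates and carried-forward measurements, can be illustrated with a minimal sketch. This is not the algorithm evaluated in the study. It assumes a simplified definition: a duplicate is an identical value recorded for the same subject and parameter on the same day, and a carried-forward value is an identical value repeated on a later day. The `Measurement` class, the parameter labels, and the flag names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Measurement:
    subject_id: str
    measured_on: date
    param: str    # hypothetical labels, e.g. "HEIGHTCM" or "WEIGHTKG"
    value: float


def flag_measurements(records: List[Measurement]) -> List[str]:
    """Return one flag per record, in the same order as the input.

    Simplified rules (an assumption, not the published method):
      - "duplicate": same value as the previous record in the series, same day
      - "carried_forward": same value as the previous record, different day
      - "include": everything else
    """
    flags = ["include"] * len(records)
    # Visit records grouped by (subject, parameter) and ordered by date.
    # sorted() is stable, so same-day records keep their input order.
    order = sorted(
        range(len(records)),
        key=lambda i: (records[i].subject_id, records[i].param, records[i].measured_on),
    )
    prev = None  # index of the previous record visited
    for i in order:
        cur = records[i]
        if prev is not None:
            last = records[prev]
            same_series = (last.subject_id, last.param) == (cur.subject_id, cur.param)
            # Exact equality is used here; real data may need a tolerance.
            if same_series and cur.value == last.value:
                if cur.measured_on == last.measured_on:
                    flags[i] = "duplicate"
                else:
                    flags[i] = "carried_forward"
        prev = i
    return flags


example = [
    Measurement("a", date(2020, 1, 1), "WEIGHTKG", 70.0),
    Measurement("a", date(2020, 1, 1), "WEIGHTKG", 70.0),
    Measurement("a", date(2020, 2, 1), "WEIGHTKG", 70.0),
    Measurement("a", date(2020, 3, 1), "WEIGHTKG", 71.5),
]
print(flag_measurements(example))
# ['include', 'duplicate', 'carried_forward', 'include']
```

Comparing each record only with its immediate predecessor keeps the sketch short. A production cleaner would also need to handle unit errors, same-day records with different values, and biologically implausible values, none of which are covered here.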

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d6be/9629892/20b39e71e967/ooac089f1.jpg
