
Combining Fourier and lagged k-nearest neighbor imputation for biomedical time series data.

Authors

Rahman Shah Atiqur, Huang Yuxiao, Claassen Jan, Heintzman Nathaniel, Kleinberg Samantha

Affiliations

Department of Computer Science, Stevens Institute of Technology, NJ, United States.

Division of Critical Care Neurology, Department of Neurology, Columbia University, College of Physicians and Surgeons, New York, NY, United States.

Publication information

J Biomed Inform. 2015 Dec;58:198-207. doi: 10.1016/j.jbi.2015.10.004. Epub 2015 Oct 21.

DOI: 10.1016/j.jbi.2015.10.004
PMID: 26477633
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4755282/
Abstract

Most clinical and biomedical data contain missing values. A patient's record may be split across multiple institutions, devices may fail, and sensors may not be worn at all times. While these missing values are often ignored, this can lead to bias and error when the data are mined. Further, the data are not simply missing at random. Instead, the measurement of a variable such as blood glucose may depend on its prior values as well as those of other variables. These dependencies exist across time as well, but current methods have yet to incorporate these temporal relationships as well as multiple types of missingness. To address this, we propose an imputation method (FLk-NN) that incorporates time-lagged correlations both within and across variables by combining two imputation methods, based on an extension to k-NN and the Fourier transform. This enables imputation of missing values even when all data at a time point are missing and when there are different types of missingness both within and across variables. In comparison to other approaches on three biological datasets (simulated and actual Type 1 diabetes datasets, and multi-modality neurological ICU monitoring), the proposed method has the highest imputation accuracy. This held when up to half of the data were missing and when consecutive missing values were a significant fraction of the overall time series length.
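
The abstract names the two ingredients (a time-lagged extension of k-NN and a Fourier-based estimate) but not how they are computed or merged, so the following is only a minimal sketch of the general idea, not the authors' FLk-NN algorithm. The function names `fourier_impute`, `lagged_knn_impute`, and `combined_impute`, the parameters `lags`, `k`, and `n_keep`, the linear-interpolation pre-fill, and the simple averaging rule are all assumptions made for illustration. It also assumes every variable has at least one observed value.

```python
import numpy as np

def fourier_impute(x, n_keep=3):
    """Fill NaNs in a 1-D series from a reconstruction that keeps only
    the mean and the n_keep strongest Fourier components (assumed heuristic)."""
    x = np.asarray(x, dtype=float)
    missing = np.isnan(x)
    if not missing.any():
        return x.copy()
    t = np.arange(len(x))
    # Pre-fill gaps by linear interpolation so the FFT sees a complete signal.
    filled = x.copy()
    filled[missing] = np.interp(t[missing], t[~missing], x[~missing])
    spectrum = np.fft.rfft(filled)
    strongest = np.argsort(np.abs(spectrum[1:]))[::-1][:n_keep] + 1
    keep = np.zeros(len(spectrum), dtype=bool)
    keep[0] = True            # always keep the mean
    keep[strongest] = True
    smooth = np.fft.irfft(np.where(keep, spectrum, 0), n=len(x))
    out = x.copy()
    out[missing] = smooth[missing]
    return out

def lagged_knn_impute(data, lags=(1, 2), k=3):
    """Fill NaNs in a (time, variables) array by averaging the k time points
    whose lagged values (all variables, at t - lag) are most similar.
    Entries whose lagged context is itself incomplete are left as NaN."""
    data = np.asarray(data, dtype=float)
    n_t, n_v = data.shape
    out = data.copy()
    # Feature row for time t: values of every variable at t - lag, per lag.
    feats = np.full((n_t, n_v * len(lags)), np.nan)
    for i, lag in enumerate(lags):
        feats[lag:, i * n_v:(i + 1) * n_v] = data[:-lag]
    complete = ~np.isnan(feats).any(axis=1)
    for v in range(n_v):
        observed = ~np.isnan(data[:, v])
        donors = np.where(complete & observed)[0]
        targets = np.where(complete & ~observed)[0]
        if len(donors) == 0:
            continue
        for t in targets:
            dist = np.linalg.norm(feats[donors] - feats[t], axis=1)
            nearest = donors[np.argsort(dist)[:k]]
            out[t, v] = data[nearest, v].mean()
    return out

def combined_impute(data, lags=(1, 2), k=3, n_keep=3):
    """Average the two estimates where both exist; otherwise fall back to the
    Fourier estimate (an assumed combination rule, not taken from the paper)."""
    data = np.asarray(data, dtype=float)
    fourier = np.column_stack([fourier_impute(col, n_keep) for col in data.T])
    knn = lagged_knn_impute(data, lags, k)
    estimate = np.where(np.isnan(knn), fourier, 0.5 * (knn + fourier))
    return np.where(np.isnan(data), estimate, data)
```

Because the k-NN features come from earlier time points rather than from the row being filled, this sketch can still produce an estimate when every variable is missing at a given time point. Inside a run of consecutive missing values the lagged context is incomplete, so only the Fourier estimate is used there.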
