Ou Huiting, Surendra Anuradha, McDowell Graeme S V, Hashimoto-Roth Emily, Xia Jianguo, Bennett Steffany A L, Čuperlović-Culf Miroslava
Department of Human Genetics, McGill University, Montreal, QC H3A 0C7, Canada.
Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, Kyoto 606-8507, Japan.
Bioinform Adv. 2025 Jan 21;5(1):vbae209. doi: 10.1093/bioadv/vbae209. eCollection 2025.
Missing values are prevalent in high-throughput measurements for a variety of experimental and analytical reasons. Imputation, the process of replacing missing values in a dataset with estimated values, plays an important role in multivariate and machine learning analyses. The three missingness patterns (missing completely at random, missing at random, and missing not at random) each describe a distinct dependency between the missing and observed data. The optimal imputation method depends on the type of data, the cause of the missingness, and the nature of the relationship between the missing and observed data. The challenge is to identify the optimal imputation solution for a given dataset.
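To make the three missingness patterns concrete, the following minimal sketch removes values from one column of a complete matrix under each mechanism. It is an illustration only, not ImpLiMet's code: the function name `simulate_missingness`, the choice of column 0 as the target, and the deterministic rank-based rules for the MAR and MNAR cases are all assumptions made for brevity.

```python
import numpy as np

def simulate_missingness(X, mechanism, frac=0.2, seed=0):
    """Return a copy of X with NaNs placed in column 0 (hypothetical helper).

    MCAR: rows are chosen uniformly at random.
    MAR:  rows are chosen from the values of an observed column (column 1).
    MNAR: rows are chosen from the values of column 0 itself, mimicking
          measurements that fall below a detection limit.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float).copy()
    n = X.shape[0]
    k = int(round(frac * n))  # number of values to remove

    if mechanism == "MCAR":
        rows = rng.choice(n, size=k, replace=False)
    elif mechanism == "MAR":
        # Simplification: drop column 0 wherever column 1 is largest.
        rows = np.argsort(X[:, 1])[-k:]
    elif mechanism == "MNAR":
        # Simplification: drop the k smallest values of column 0.
        rows = np.argsort(X[:, 0])[:k]
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")

    X[rows, 0] = np.nan
    return X
```

In real data the MAR and MNAR dependencies are usually probabilistic rather than hard cut-offs; the rank-based rules here are only the simplest way to show which variable drives the missingness.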
ImpLiMet is a user-friendly web platform that lets users impute missing data using eight different methods. For a given dataset, ImpLiMet suggests the optimal imputation solution through a grid search that compares imputation error rates across simulations of the three missingness patterns. The effect of imputation can be assessed visually through histograms, kurtosis, and skewness, as well as principal component analysis, which together show the impact of the chosen imputation method on the distribution and overall behavior of the data.
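The idea of a grid search over simulated missingness can be sketched as follows: hide known values under each mechanism, impute them with each candidate method, and rank the methods by their error on the hidden entries. This is a hedged illustration under stated assumptions, not ImpLiMet's implementation. The three column-wise imputers (`mean`, `median`, `half_min`) are simple stand-ins for the platform's eight methods, and the helper names, the RMSE error measure, and the missing fractions are all hypothetical.

```python
import numpy as np

def mask_values(X, mechanism, frac, rng):
    """Boolean mask of column-0 entries to hide (hypothetical helper)."""
    n = X.shape[0]
    k = max(1, int(round(frac * n)))
    if mechanism == "MCAR":
        rows = rng.choice(n, size=k, replace=False)   # uniformly at random
    elif mechanism == "MAR":
        rows = np.argsort(X[:, 1])[-k:]               # driven by observed column 1
    elif mechanism == "MNAR":
        rows = np.argsort(X[:, 0])[:k]                # driven by the hidden values
    else:
        raise ValueError(f"unknown mechanism: {mechanism}")
    mask = np.zeros(X.shape, dtype=bool)
    mask[rows, 0] = True
    return mask

# Stand-in imputers: each maps a column containing NaNs to one fill value.
IMPUTERS = {
    "mean": lambda col: np.nanmean(col),
    "median": lambda col: np.nanmedian(col),
    "half_min": lambda col: np.nanmin(col) / 2.0,
}

def impute(X_missing, method):
    """Fill NaNs column by column with the chosen stand-in imputer."""
    X_filled = X_missing.copy()
    for j in range(X_filled.shape[1]):
        missing = np.isnan(X_filled[:, j])
        if missing.any():
            X_filled[missing, j] = IMPUTERS[method](X_filled[:, j])
    return X_filled

def grid_search(X, fracs=(0.1, 0.2), seed=0):
    """RMSE on hidden entries for every (mechanism, fraction, method)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    results = {}
    for mechanism in ("MCAR", "MAR", "MNAR"):
        for frac in fracs:
            mask = mask_values(X, mechanism, frac, rng)
            X_missing = X.copy()
            X_missing[mask] = np.nan
            for method in IMPUTERS:
                X_filled = impute(X_missing, method)
                err = np.sqrt(np.mean((X_filled[mask] - X[mask]) ** 2))
                results[(mechanism, frac, method)] = float(err)
    return results

def best_method(results, mechanism):
    """Method with the lowest mean error for one missingness mechanism."""
    scores = {}
    for (mech, _frac, method), err in results.items():
        if mech == mechanism:
            scores.setdefault(method, []).append(err)
    return min(scores, key=lambda m: np.mean(scores[m]))
```

Calling `grid_search` on a complete subset of a dataset and then `best_method` for each mechanism gives one suggested method per missingness pattern; a real evaluation would also repeat the simulation over several random seeds and mask more than one column.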
ImpLiMet is freely available at https://complimet.ca/shiny/implimet/ and https://github.com/complimet/ImpLiMet.