Fino Nora F, Inker Lesley A, Greene Tom, Adingwupu Ogechi M, Coresh Josef, Seegmiller Jesse, Shlipak Michael G, Jafar Tazeen H, Kalil Roberto, Costa E Silva Veronica T, Gudnason Vilmundur, Levey Andrew S, Haaland Ben
Division of Biostatistics, Department of Population Health Sciences, University of Utah Health, Salt Lake City, Utah, United States of America.
Division of Nephrology, Department of Medicine, Tufts Medical Center, Boston, Massachusetts, United States of America.
PLoS One. 2024 Dec 2;19(12):e0313154. doi: 10.1371/journal.pone.0313154. eCollection 2024.
Assessing glomerular filtration rate (GFR) is critical for the diagnosis, staging, and management of kidney disease. However, the accuracy of estimated GFR (eGFR) is limited by large errors (errors exceeding 30% in >10-50% of patients), which adversely affect patient care. These errors often result from variation across populations in the non-GFR determinants of the filtration markers used to estimate GFR. We hypothesized that combining multiple filtration markers with non-overlapping non-GFR determinants into a panel GFR could improve eGFR accuracy, extending the current recognition that adding cystatin C to serum creatinine improves accuracy. Non-GFR determinants of markers can affect the accuracy of eGFR in two ways. First, greater variability in the non-GFR determinants of some filtration markers in application populations than in the development population may produce outlying values for those markers. Second, systematic differences in the non-GFR determinants of some markers between application and development populations can lead to biased estimates in the application populations. Here, we propose and evaluate methods for estimating GFR from multiple markers in applications with potentially higher rates of outlying predictors than in the development data. We apply transfer learning to address systematic differences between application and development populations. We evaluated a panel of 8 markers (5 metabolites and 3 low molecular weight proteins) in 3,554 participants from 9 studies. Results show that contamination in two strongly predictive markers can increase imprecision more than two-fold, but outlier identification with robust estimation can restore precision to nearly the level seen with uncontaminated data. Furthermore, transfer learning can yield similar results even with a modest training set sample size. Combining the two approaches addresses both sources of error in GFR estimates.
Once the laboratory challenge of developing a validated targeted assay for additional metabolites is overcome, these methods can inform the use of a panel eGFR across diverse clinical settings, ensuring accuracy despite differing non-GFR determinants.
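As a rough illustration of the outlier-identification idea only, and not the authors' actual estimator, the following Python sketch combines hypothetical per-marker eGFR values on the log scale and drops any marker that lies far from the panel median. The marker names, the median/MAD flagging rule, the `threshold` and `min_scale` parameters, and the numbers are all assumptions made for this example; the abstract does not specify the robust method used.

```python
import math
import statistics


def robust_panel_egfr(marker_egfr, threshold=3.0, min_scale=0.05):
    """Combine per-marker eGFR values, excluding flagged outliers.

    marker_egfr: dict of hypothetical marker name -> eGFR from that marker alone.
    threshold, min_scale: illustrative tuning values, not taken from the paper.
    Returns (panel eGFR, list of flagged marker names).
    """
    logs = {name: math.log(value) for name, value in marker_egfr.items()}
    center = statistics.median(logs.values())
    # Median absolute deviation, rescaled to be comparable to a standard deviation.
    mad = statistics.median(abs(v - center) for v in logs.values())
    scale = max(1.4826 * mad, min_scale)
    flagged = [n for n, v in logs.items() if abs(v - center) / scale > threshold]
    kept = [v for n, v in logs.items() if n not in flagged]
    # Geometric mean of the markers that were not flagged.
    return math.exp(statistics.fmean(kept)), flagged


# Hypothetical panel of 8 markers; marker_h mimics a contaminated value.
panel = {
    "marker_a": 58.0, "marker_b": 59.0, "marker_c": 60.0, "marker_d": 60.0,
    "marker_e": 61.0, "marker_f": 62.0, "marker_g": 63.0, "marker_h": 150.0,
}
naive = math.exp(statistics.fmean(math.log(v) for v in panel.values()))
robust, flagged = robust_panel_egfr(panel)
print(f"naive={naive:.1f} robust={robust:.1f} flagged={flagged}")
```

In this toy panel the single contaminated marker pulls the naive geometric mean upward, while the flagged-and-excluded version stays close to the values on which the remaining markers agree.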