Department of Computational Molecular Biology, Max-Planck-Institute for Molecular Genetics, Berlin, Germany.
J Am Med Inform Assoc. 2012 Mar-Apr;19(2):255-62. doi: 10.1136/amiajnl-2010-000004. Epub 2011 Aug 28.
Renal transplantation has dramatically improved the survival rate of hemodialysis patients. However, with a growing proportion of marginal organs and improved immunosuppression, it is necessary to verify that the established allocation system, mostly based on human leukocyte antigen matching, still meets today's needs. The authors turn to machine-learning techniques to predict, from donor-recipient data, the estimated glomerular filtration rate (eGFR) of the recipient 1 year after transplantation.
The patient's eGFR was predicted using donor-recipient characteristics available at the time of transplantation. Donors' data were obtained from Eurotransplant's database, while recipients' details were retrieved from Charité Campus Virchow-Klinikum's database. A total of 707 renal transplantations from cadaveric donors were included.
Two separate datasets were created, taking features with <10% missing values for one and <50% missing values for the other. Four established regressors were run on both datasets, with and without feature selection.
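The construction of the two datasets can be illustrated with a minimal sketch. It assumes each transplantation is a dictionary of feature values with `None` marking a missing entry. The feature names and toy values are hypothetical and do not come from the study's data, and this is not the authors' code.

```python
from typing import Dict, List, Optional

# One record per transplantation; None marks a missing value (assumed encoding).
Row = Dict[str, Optional[float]]


def missing_fraction(rows: List[Row], feature: str) -> float:
    """Return the fraction of rows in which `feature` is absent or None."""
    if not rows:
        raise ValueError("rows must not be empty")
    missing = sum(1 for row in rows if row.get(feature) is None)
    return missing / len(rows)


def features_below_threshold(rows: List[Row], threshold: float) -> List[str]:
    """Keep features whose missing fraction is strictly below `threshold`."""
    names = sorted({name for row in rows for name in row})
    return [name for name in names if missing_fraction(rows, name) < threshold]


# Hypothetical toy records, for illustration only.
toy_rows: List[Row] = [
    {"donor_age": 54.0, "donor_creatinine": 0.9, "cold_ischemia_h": None},
    {"donor_age": 61.0, "donor_creatinine": None, "cold_ischemia_h": None},
    {"donor_age": 47.0, "donor_creatinine": 1.1, "cold_ischemia_h": 12.0},
    {"donor_age": 39.0, "donor_creatinine": 1.0, "cold_ischemia_h": None},
]

strict_features = features_below_threshold(toy_rows, 0.10)     # <10% missing
inclusive_features = features_below_threshold(toy_rows, 0.50)  # <50% missing
```

On this toy input the stricter threshold keeps only the fully observed feature, while the more inclusive threshold also keeps the feature that is missing in one of four records.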
The authors obtained a Pearson correlation coefficient between predicted and observed eGFR (COR) of 0.48. The best-performing model was a Gaussian support vector machine with recursive feature elimination, run on the more inclusive dataset (features with <50% missing values). All results are available at http://transplant.molgen.mpg.de/.
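The evaluation metric, COR, is the standard Pearson correlation coefficient. The sketch below shows one way to compute it between predicted and measured eGFR values. The function name is an assumption introduced here for illustration.

```python
import math
from typing import Sequence


def pearson_correlation(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    n = len(predicted)
    if n < 2:
        raise ValueError("at least two value pairs are required")

    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n

    # Sums of centered products; the 1/n factors cancel in the final ratio.
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)

    if var_p == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_p * var_o)
```

A value of 1 indicates a perfect increasing linear relationship and 0 indicates no linear relationship, so a COR of 0.48 corresponds to a moderate linear association between predictions and measurements.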
For now, missing values in the data must be predicted and filled in. The performance is not as high as hoped, but the dataset seems to be the main cause.
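The abstract does not say how missing values were predicted. As a stand-in, the sketch below uses plain column-mean imputation, which is a simpler technique than model-based prediction and is shown only to make the filling-in step concrete. The record layout (dictionaries with `None` for missing entries) and the function name are assumptions.

```python
from typing import Dict, List, Optional


def impute_with_column_means(
    rows: List[Dict[str, Optional[float]]]
) -> List[Dict[str, float]]:
    """Replace each missing (None or absent) value with the mean of its feature."""
    names = sorted({name for row in rows for name in row})

    # Mean of the observed values for every feature.
    means: Dict[str, float] = {}
    for name in names:
        observed = [row[name] for row in rows if row.get(name) is not None]
        if not observed:
            raise ValueError(f"feature {name!r} has no observed values")
        means[name] = sum(observed) / len(observed)

    # Build new records; the input rows are left unmodified.
    return [
        {
            name: row[name] if row.get(name) is not None else means[name]
            for name in names
        }
        for row in rows
    ]
```

When a model is evaluated on held-out data, the means should be computed on the training portion only, so that no information from the test records leaks into the imputed values.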
Predicting the outcome is possible with the dataset at hand (COR=0.48). Valuable features include age and creatinine levels of the donor, as well as sex and weight of the recipient.