Rescuing missing data in connectome-based predictive modeling.

Authors

Liang Qinghao, Jiang Rongtao, Adkinson Brendan D, Rosenblatt Matthew, Mehta Saloni, Foster Maya L, Dong Siyuan, You Chenyu, Negahban Sahand, Zhou Harrison H, Chang Joseph, Scheinost Dustin

Affiliations

Department of Biomedical Engineering, Yale University, New Haven, CT, United States.

Department of Radiology & Biomedical Imaging, Yale School of Medicine, New Haven, CT, United States.

Publication information

Imaging Neurosci (Camb). 2024 Feb 2;2. doi: 10.1162/imag_a_00071. eCollection 2024.

Abstract

Recent evidence suggests that brain-phenotype predictions may require very large sample sizes. However, as the sample size increases, the amount of missing data also increases. Conventional methods, such as complete-case analysis, discard useful information and shrink the sample size. To address this problem, we investigated rescuing missing data through imputation, that is, substituting estimated values for missing data so that they can be used in downstream analyses. We integrated imputation methods into the Connectome-based Predictive Modeling (CPM) framework. Using four open-source datasets (the Human Connectome Project, the Philadelphia Neurodevelopmental Cohort, the UCLA Consortium for Neuropsychiatric Phenomics, and the Healthy Brain Network [HBN]), we validated our framework and compared different imputation methods against complete-case analysis in two scenarios: missing connectomes and missing phenotypic measures. Imputing connectomes yielded better prediction performance than complete-case analysis on both real and simulated missing data. In addition, imputation accuracy was a good indicator for choosing an imputation method for missing phenotypic measures, but it was not informative for missing connectomes. In a real-world example predicting cognition in the HBN, we rescued 628 individuals through imputation, doubling the complete-case sample size and increasing the variance explained by the predicted value by 45%. In conclusion, our study provides a benchmark of state-of-the-art imputation techniques for missing connectome and phenotypic data in predictive modeling scenarios. Our results suggest that rescuing data with effective imputation, instead of discarding participants with missing information, improves prediction performance.
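To make the contrast between complete-case analysis and imputation concrete, here is a minimal sketch on toy data. It is not the authors' pipeline: the function names `complete_case` and `mean_impute` are hypothetical, the data are invented for illustration, and simple column-mean imputation stands in for the more advanced methods the study benchmarks. The imputation statistics are estimated from the training rows only, which is an assumption about how one would avoid leakage when imputing inside a cross-validated predictive model.

```python
import numpy as np

def complete_case(X, y):
    """Complete-case analysis: keep only rows with no missing value."""
    keep = ~np.isnan(X).any(axis=1) & ~np.isnan(y)
    return X[keep], y[keep]

def mean_impute(X_train, X_test):
    """Fill NaNs with column means estimated from the training rows only.

    Hypothetical helper: mean imputation is a stand-in technique here,
    chosen for brevity rather than taken from the paper.
    """
    col_means = np.nanmean(X_train, axis=0)

    def fill(X):
        return np.where(np.isnan(X), col_means, X)

    return fill(X_train), fill(X_test)

# Toy data (invented): 6 participants x 3 features, NaN marks a missing value.
X = np.array([[1.0, 2.0, 3.0],
              [2.0, np.nan, 1.0],
              [3.0, 4.0, np.nan],
              [4.0, 6.0, 5.0],
              [np.nan, 8.0, 7.0],
              [6.0, 10.0, 9.0]])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Complete-case analysis drops every participant with any missing feature.
X_cc, y_cc = complete_case(X, y)

# Imputation keeps all participants; the first 4 rows act as the training fold.
X_train_imp, X_test_imp = mean_impute(X[:4], X[4:])

print("complete cases:", X_cc.shape[0], "of", X.shape[0])
print("participants kept after imputation:",
      X_train_imp.shape[0] + X_test_imp.shape[0])
```

In this toy case, complete-case analysis keeps only the participants with fully observed features, while imputation retains every participant for model fitting, which is the trade-off the abstract describes.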

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7764/12224408/4e5ae22ee1a4/imag_a_00071_fig1.jpg
