Rescuing missing data in connectome-based predictive modeling.

Authors

Liang Qinghao, Jiang Rongtao, Adkinson Brendan D, Rosenblatt Matthew, Mehta Saloni, Foster Maya L, Dong Siyuan, You Chenyu, Negahban Sahand, Zhou Harrison H, Chang Joseph, Scheinost Dustin

Affiliations

Department of Biomedical Engineering, Yale University, New Haven, CT, United States.

Department of Radiology & Biomedical Imaging, Yale School of Medicine, New Haven, CT, United States.

Publication information

Imaging Neurosci (Camb). 2024 Feb 2;2. doi: 10.1162/imag_a_00071. eCollection 2024.

DOI:10.1162/imag_a_00071

PMID:40800425

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12224408/

Abstract

Recent evidence suggests brain-phenotype predictions may require very large sample sizes. However, as the sample size increases, the amount of missing data also increases. Conventional methods, like complete-case analysis, discard useful information and shrink the sample size. To address the missing data problem, we investigated rescuing these missing data through imputation, which substitutes estimated values for missing data so that they can be used in downstream analyses. We integrated imputation methods into the Connectome-based Predictive Modeling (CPM) framework. Using four open-source datasets, namely the Human Connectome Project, the Philadelphia Neurodevelopmental Cohort, the UCLA Consortium for Neuropsychiatric Phenomics, and the Healthy Brain Network (HBN), we validated our framework and compared different imputation methods against complete-case analysis for both missing-connectome and missing-phenotypic-measure scenarios. Imputing connectomes exhibited superior prediction performance on real and simulated missing data compared to complete-case analysis. In addition, we found that imputation accuracy was a good indicator for choosing an imputation method for missing phenotypic measures but was not informative for missing connectomes. In a real-world example predicting cognition using the HBN, we rescued 628 individuals through imputation, doubling the complete-case sample size and increasing the variance explained by the predicted value by 45%. In conclusion, our study is a benchmark for state-of-the-art imputation techniques when dealing with missing connectome and phenotypic data in predictive modeling scenarios. Our results suggest that rescuing data with imputation, instead of discarding participants with missing information, improves prediction performance.
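The contrast between complete-case analysis and imputation can be illustrated with a minimal sketch. The code below is not the authors' pipeline: it uses column-mean imputation as a simple stand-in (the abstract does not name the specific methods benchmarked), and the toy matrix, the variable names, and the helper functions `complete_case` and `mean_impute` are all hypothetical.

```python
import numpy as np

def complete_case(X, y):
    """Keep only participants with no missing feature and an observed phenotype."""
    keep = ~np.isnan(X).any(axis=1) & ~np.isnan(y)
    return X[keep], y[keep]

def mean_impute(X):
    """Replace each missing entry with the mean of the observed values in its column.

    Assumption: column-mean imputation is used purely for illustration.
    """
    col_means = np.nanmean(X, axis=0)
    X_filled = X.copy()
    rows, cols = np.where(np.isnan(X_filled))
    X_filled[rows, cols] = col_means[cols]
    return X_filled

# Toy data (hypothetical): 6 participants x 3 connectivity features, NaN = missing.
X = np.array([
    [0.2, 0.5, np.nan],
    [0.4, 0.1, 0.3],
    [np.nan, 0.7, 0.9],
    [0.6, 0.3, 0.5],
    [0.8, np.nan, 0.1],
    [0.0, 0.4, 0.2],
])
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # hypothetical phenotype scores

X_cc, y_cc = complete_case(X, y)   # participants with any missing value are dropped
X_imp = mean_impute(X)             # every participant is retained

print("complete-case participants:", X_cc.shape[0])
print("participants after imputation:", X_imp.shape[0])
```

In this sketch the imputed matrix keeps every participant for downstream modeling, whereas complete-case analysis discards any row with a missing value. Whether the retained rows actually improve prediction depends on the imputation method and the data, which is the question the study benchmarks.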

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7764/12224408/4e5ae22ee1a4/imag_a_00071_fig1.jpg
