Training confounder-free deep learning models for medical applications.

Affiliations

Department of Psychiatry & Behavioral Sciences, Stanford University, Stanford, CA, 94305, USA.

Department of Computer Science, Stanford University, Stanford, CA, 94305, USA.

Publication information

Nat Commun. 2020 Nov 26;11(1):6010. doi: 10.1038/s41467-020-19784-9.

Abstract

The presence of confounding effects (or biases) is one of the most critical challenges in using deep learning to advance discovery in medical imaging studies. Confounders affect the relationship between input data (e.g., brain MRIs) and output variables (e.g., diagnosis). Improper modeling of those relationships often results in spurious and biased associations. Traditional machine learning and statistical models minimize the impact of confounders by, for example, matching data sets, stratifying data, or residualizing imaging measurements. Alternative strategies are needed for state-of-the-art deep learning models that use end-to-end training to automatically extract informative features from large sets of images. In this article, we introduce an end-to-end approach for deriving features invariant to confounding factors while accounting for intrinsic correlations between the confounder(s) and prediction outcome. The method does so by exploiting concepts from traditional statistical methods and recent fair machine learning schemes. We evaluate the method on predicting the diagnosis of HIV solely from Magnetic Resonance Images (MRIs), identifying morphological sex differences in adolescence from those of the National Consortium on Alcohol and Neurodevelopment in Adolescence (NCANDA), and determining the bone age from X-ray images of children. The results show that our method can make accurate predictions while reducing biases associated with confounders. The code is available at https://github.com/qingyuzhao/br-net.
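
For readers unfamiliar with the traditional strategies named in the abstract, the sketch below illustrates one of them, residualizing measurements against a confounder, on synthetic data. It is not the authors' method or code: the function name `residualize`, the variable names, the linear model, and the simulated "age" confounder are all assumptions chosen for illustration.

```python
import numpy as np

def residualize(features, confounders):
    """Remove the linear effect of the confounders from each feature column.

    Hypothetical helper for illustration; assumes a linear confounder effect.
    """
    features = np.asarray(features, dtype=float)
    confounders = np.asarray(confounders, dtype=float)
    if confounders.ndim == 1:
        confounders = confounders[:, None]
    # Design matrix: intercept column plus one column per confounder.
    design = np.column_stack([np.ones(len(confounders)), confounders])
    # Ordinary least-squares fit of every feature column on the design matrix.
    coef, *_ = np.linalg.lstsq(design, features, rcond=None)
    # Residuals are what remains after subtracting the fitted confounder effect.
    return features - design @ coef

rng = np.random.default_rng(0)
age = rng.uniform(20, 80, size=200)  # synthetic confounder (assumption)
# Synthetic "imaging measurements" that drift linearly with age plus noise.
measurements = np.column_stack([
    0.05 * age + rng.normal(size=200),
    -0.03 * age + rng.normal(size=200),
])

residuals = residualize(measurements, age)

# Absolute correlation of each measurement with the confounder, before and after.
corr_before = [abs(np.corrcoef(age, measurements[:, j])[0, 1]) for j in range(2)]
corr_after = [abs(np.corrcoef(age, residuals[:, j])[0, 1]) for j in range(2)]
print("before:", corr_before)
print("after: ", corr_after)
```

Residualization of this kind acts on fixed, precomputed measurements, which is why, as the abstract notes, alternative strategies are needed when features are learned end-to-end from the images themselves.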

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eaa8/7691500/2c5fdc8f7f47/41467_2020_19784_Fig1_HTML.jpg
