

Cooperative learning for multiview analysis.

Author information

Department of Biomedical Data Science, Stanford University, Stanford, CA 94305.

Department of Statistics, Stanford University, Stanford, CA 94305.

Publication information

Proc Natl Acad Sci U S A. 2022 Sep 20;119(38):e2202113119. doi: 10.1073/pnas.2202113119. Epub 2022 Sep 12.

Abstract

We propose a method for supervised learning with multiple sets of features ("views"). The multiview problem is especially important in biology and medicine, where "-omics" data, such as genomics, proteomics, and radiomics, are measured on a common set of samples. "Cooperative learning" combines the usual squared-error loss of predictions with an "agreement" penalty to encourage the predictions from different data views to agree. By varying the weight of the agreement penalty, we get a continuum of solutions that include the well-known early and late fusion approaches. Cooperative learning chooses the degree of agreement (or fusion) in an adaptive manner, using a validation set or cross-validation to estimate test set prediction error. One version of our fitting procedure is modular, where one can choose different fitting mechanisms (e.g., lasso, random forests, boosting, or neural networks) appropriate for different data views. In the setting of cooperative regularized linear regression, the method combines the lasso penalty with the agreement penalty, yielding feature sparsity. The method can be especially powerful when the different data views share some underlying relationship in their signals that can be exploited to boost the signals. We show that cooperative learning achieves higher predictive accuracy on simulated data and real multiomics examples of labor-onset prediction. By leveraging aligned signals and allowing flexible fitting mechanisms for different modalities, cooperative learning offers a powerful approach to multiomics data fusion.
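The combination of squared-error loss and agreement penalty described above can be illustrated with a minimal sketch for two views and plain linear fits. The abstract does not state the objective explicitly, so the form used here, 0.5*||y - X bx - Z bz||^2 + 0.5*rho*||X bx - Z bz||^2, is an assumption, as are the function names `fit_cooperative_ls`, `predict`, and `choose_rho` and the synthetic data. The lasso penalty is omitted for brevity, and the fit is obtained by stacking the agreement term as extra rows of an ordinary least-squares problem.

```python
import numpy as np

def fit_cooperative_ls(X, Z, y, rho):
    """Two-view least squares with an agreement penalty (illustrative only).

    Assumed objective:
        0.5 * ||y - X bx - Z bz||^2 + 0.5 * rho * ||X bx - Z bz||^2
    The penalty is folded into an augmented least-squares system.
    """
    n, px = X.shape
    s = np.sqrt(rho)
    # Top rows: the usual fit on concatenated views.
    # Bottom rows: penalize disagreement between the per-view predictions.
    A = np.vstack([np.hstack([X, Z]),
                   np.hstack([-s * X, s * Z])])
    b = np.concatenate([y, np.zeros(n)])
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coef[:px], coef[px:]

def predict(X, Z, bx, bz):
    """Prediction is the sum of the per-view contributions."""
    return X @ bx + Z @ bz

def choose_rho(X, Z, y, Xv, Zv, yv, rhos):
    """Pick the agreement weight with the lowest validation squared error."""
    errs = {}
    for rho in rhos:
        bx, bz = fit_cooperative_ls(X, Z, y, rho)
        errs[rho] = float(np.mean((yv - predict(Xv, Zv, bx, bz)) ** 2))
    best = min(errs, key=errs.get)
    return best, errs

# Synthetic example: both views carry the same latent signal u (hypothetical data).
rng = np.random.default_rng(0)
n, p = 200, 5
u = rng.normal(size=2 * n)
X_all = u[:, None] + rng.normal(size=(2 * n, p))
Z_all = u[:, None] + rng.normal(size=(2 * n, p))
y_all = u + 0.5 * rng.normal(size=2 * n)

X_tr, Z_tr, y_tr = X_all[:n], Z_all[:n], y_all[:n]
X_va, Z_va, y_va = X_all[n:], Z_all[n:], y_all[n:]

best_rho, val_errors = choose_rho(X_tr, Z_tr, y_tr, X_va, Z_va, y_va,
                                  rhos=[0.0, 0.5, 1.0, 5.0])
print("selected rho:", best_rho)
print("validation MSE by rho:", val_errors)
```

Under this assumed objective, rho = 0 reduces to least squares on the concatenated features (early fusion), and rho = 1 decouples the views so that the prediction is the average of the separate per-view fits (a simple form of late fusion). Larger values push the per-view predictions closer together.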


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d80d/9499553/96a68411c006/pnas.2202113119fig01.jpg
