Learning Extremal Representations with Deep Archetypal Analysis.

Authors

Keller Sebastian Mathias, Samarin Maxim, Arend Torres Fabricio, Wieser Mario, Roth Volker

Affiliation

Department of Mathematics and Computer Science, University of Basel, Spiegelgasse 1, 4051 Basel, Switzerland.

Publication information

Int J Comput Vis. 2021;129(4):805-820. doi: 10.1007/s11263-020-01390-3. Epub 2020 Dec 23.

DOI:10.1007/s11263-020-01390-3

PMID:34720403

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8550171/

Abstract

Archetypes represent extreme manifestations of a population with respect to specific characteristic traits or features. In linear feature space, archetypes approximate the convex hull of the data, allowing all data points to be expressed as convex mixtures of archetypes. As mixing of archetypes is performed directly on the input data, linear Archetypal Analysis requires additivity of the input, which is a strong assumption that is unlikely to hold, e.g., in the case of image data. To address this problem, we propose learning an appropriate feature space while simultaneously identifying suitable archetypes. We thus introduce a generative formulation of the linear archetype model, parameterized by neural networks. By introducing the distance-dependent archetype loss, the linear archetype model can be integrated into the latent space of a deep variational information bottleneck, and an optimal representation, together with the archetypes, can be learned end-to-end. Moreover, the information bottleneck framework allows for a natural incorporation of arbitrarily complex side information during training. As a consequence, learned archetypes become easily interpretable, as they derive their meaning directly from the included side information. Applicability of the proposed method is demonstrated by exploring archetypes of female facial expressions while using multi-rater based emotion scores of these expressions as side information. A second application illustrates the exploration of the chemical space of small organic molecules. By using different kinds of side information, we demonstrate how identified archetypes, along with their interpretation, largely depend on the side information provided.
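The linear model described in the abstract can be illustrated with a small NumPy sketch. It shows only the convex-mixture idea: each data point is a weighted average of archetypes, with non-negative weights that sum to one. All names here (`softmax`, `mix_archetypes`, `archetype_loss`) and the toy archetype coordinates are assumptions made for illustration. In particular, `archetype_loss` is a simplified distance-based stand-in, not the loss defined in the paper, and no neural network or information bottleneck is involved.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Map arbitrary scores to simplex weights (non-negative, summing to 1)."""
    shifted = logits - logits.max(axis=axis, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def mix_archetypes(weights, archetypes):
    """Express each point as a convex mixture of the archetypes.

    weights:    (n_points, n_archetypes), each row on the probability simplex
    archetypes: (n_archetypes, n_features)
    """
    return weights @ archetypes


def archetype_loss(pred_archetypes, fixed_archetypes):
    """Hypothetical distance-based penalty (an assumption, not the paper's loss):
    mean squared Euclidean distance between predicted and target archetypes."""
    diff = pred_archetypes - fixed_archetypes
    return float(np.mean(np.sum(diff ** 2, axis=1)))


rng = np.random.default_rng(0)

# Three toy archetypes: the vertices of a triangle in a 2-D feature space.
archetypes = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

# Random mixture weights for five points; softmax places each row on the simplex.
weights = softmax(rng.normal(size=(5, 3)))

# Every mixed point lies inside the triangle spanned by the archetypes.
points = mix_archetypes(weights, archetypes)
```

Because the weights are constrained to the simplex, the mixed points cannot leave the convex hull of the archetypes. This is why the linear model needs additive inputs, and why the paper instead looks for archetypes in a learned latent space.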

SUPPLEMENTARY INFORMATION

The online version contains supplementary material available at 10.1007/s11263-020-01390-3.

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/20c7/8550171/3698caa35e07/11263_2020_1390_Fig1_HTML.jpg
