Naval Postgraduate School, 1411 Cunningham Road, Monterey, CA, 93943-5219, USA.
University of California, Berkeley, 1042 Evans Hall, Berkeley, CA, 94720-3840, USA.
Bull Math Biol. 2019 Feb;81(2):568-597. doi: 10.1007/s11538-018-0493-4. Epub 2018 Sep 11.
Principal component analysis is a widely used method for reducing the dimensionality of a data set in a high-dimensional Euclidean space. Here we define and analyze two analogues of principal component analysis in the setting of tropical geometry. In one approach, we study the Stiefel tropical linear space of fixed dimension closest to the data points in the tropical projective torus; in the other approach, we consider the tropical polytope with a fixed number of vertices closest to the data points. We then give approximate algorithms for both approaches and apply them to phylogenetics, testing the methods on simulated phylogenetic data and on an empirical dataset of Apicomplexa genomes.
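To make the second approach concrete, here is a minimal Python sketch of how "closeness" of data points to a tropical polytope could be measured. It is an illustration under stated assumptions, not the authors' algorithm: it assumes the max-plus convention, the standard tropical metric on the tropical projective torus, and a sum-of-distances objective. The function names (`tropical_distance`, `project_onto_tropical_polytope`, `sum_of_residuals`) are hypothetical and chosen for this example only.

```python
def tropical_distance(x, y):
    """Tropical metric: max_i(x_i - y_i) - min_i(x_i - y_i).

    Invariant under adding a constant to all coordinates of x or y,
    so it is well defined on the tropical projective torus.
    """
    diffs = [a - b for a, b in zip(x, y)]
    return max(diffs) - min(diffs)


def project_onto_tropical_polytope(x, vertices):
    """Project x onto the max-plus tropical convex hull of `vertices`.

    Assumption: max-plus convention. For each vertex v, take the largest
    shift lam with lam + v <= x coordinatewise, then take the
    coordinatewise maximum of the shifted vertices.
    """
    lambdas = [min(xj - vj for xj, vj in zip(x, v)) for v in vertices]
    return [
        max(lam + v[j] for lam, v in zip(lambdas, vertices))
        for j in range(len(x))
    ]


def sum_of_residuals(points, vertices):
    """Assumed objective: total tropical distance from points to their projections."""
    return sum(
        tropical_distance(p, project_onto_tropical_polytope(p, vertices))
        for p in points
    )


# Hypothetical toy data: a tropical segment with two vertices in R^3.
vertices = [(0, 0, 0), (0, 3, 1)]
points = [(0, 2, 5), (0, 3, 1)]
print(project_onto_tropical_polytope(points[0], vertices))
print(sum_of_residuals(points, vertices))
```

Under these assumptions, a search for the best polytope with a fixed number of vertices would try to choose `vertices` so that `sum_of_residuals` is small; how that search is carried out is not specified here.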