IEEE Trans Pattern Anal Mach Intell. 2016 Sep;38(9):1915-21. doi: 10.1109/TPAMI.2015.2496166. Epub 2015 Oct 29.
Euclidean statistics are often generalized to Riemannian manifolds by replacing straight-line interpolations with geodesic ones. While these Riemannian models look familiar, they are restricted by the inflexibility of geodesics, and they rely on constructions that are optimal only in Euclidean domains. We consider extensions of Principal Component Analysis (PCA) to Riemannian manifolds. Classic Riemannian approaches seek a geodesic curve passing through the mean that optimizes a criterion of interest. The requirements that the solution be a geodesic and pass through the mean tend to imply that these methods work well only when the manifold is mostly flat within the support of the generating distribution. We argue that instead of generalizing linear Euclidean models, it is more fruitful to generalize non-linear Euclidean models. Specifically, we extend the classic Principal Curves of Hastie & Stuetzle to data residing on a complete Riemannian manifold. We show that for elliptical distributions in the tangent space of spaces of constant curvature, the standard principal geodesic is a principal curve. The proposed model is simple to compute and avoids many of the pitfalls of traditional geodesic approaches. We empirically demonstrate the effectiveness of the Riemannian principal curves on several manifolds and datasets.
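The basic substitution described in the abstract, replacing straight-line interpolation with geodesic interpolation, can be illustrated on the unit sphere. The following is a minimal sketch and not the authors' implementation: the choice of the sphere as the manifold and the function names `sphere_log`, `sphere_exp`, `geodesic_interpolate`, and `linear_interpolate` are assumptions made for illustration only.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def sphere_log(p, q):
    """Log map on the unit sphere (illustrative helper, assumed name).

    Returns the tangent vector at p that points toward q and whose
    length equals the geodesic (great-circle) distance from p to q.
    """
    c = max(-1.0, min(1.0, dot(p, q)))
    theta = math.acos(c)
    if theta < 1e-12:
        return [0.0] * len(p)
    # Component of q orthogonal to p.
    w = [qi - c * pi for pi, qi in zip(p, q)]
    norm_w = math.sqrt(dot(w, w))
    if norm_w < 1e-12:
        raise ValueError("log map is not unique for antipodal points")
    return [theta * wi / norm_w for wi in w]


def sphere_exp(p, v):
    """Exp map on the unit sphere: follow the geodesic from p along v."""
    t = math.sqrt(dot(v, v))
    if t < 1e-12:
        return list(p)
    return [math.cos(t) * pi + math.sin(t) * vi / t for pi, vi in zip(p, v)]


def geodesic_interpolate(p, q, t):
    """Point at fraction t along the geodesic from p to q."""
    v = sphere_log(p, q)
    return sphere_exp(p, [t * vi for vi in v])


def linear_interpolate(p, q, t):
    """Straight-line (Euclidean) interpolation, for comparison."""
    return [(1 - t) * pi + t * qi for pi, qi in zip(p, q)]


if __name__ == "__main__":
    p, q = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
    g = geodesic_interpolate(p, q, 0.5)
    l = linear_interpolate(p, q, 0.5)
    # The geodesic midpoint stays on the sphere; the linear one does not.
    print("geodesic midpoint norm:", math.sqrt(dot(g, g)))
    print("linear midpoint norm:  ", math.sqrt(dot(l, l)))
```

The geodesic midpoint keeps unit norm while the Euclidean midpoint falls inside the sphere, which is why manifold-valued models interpolate through the exp and log maps rather than along chords.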