Zhan Qipeng, Zhou Zhuoping, Wen Zixuan, Wang Zexuan, Tong Boning, Huang Heng, Saykin Andrew J, Thompson Paul M, Davatzikos Christos, Shen Li
University of Pennsylvania, Philadelphia, PA, USA.
AMIA Jt Summits Transl Sci Proc. 2025 Jun 10;2025:634-643. eCollection 2025.
Logistic regression is a widely used model in machine learning, particularly as a baseline for binary classification tasks due to its simplicity, effectiveness, and interpretability. It is especially powerful when dealing with categorical features. Despite its advantages, standard logistic regression fails to capture the distributional and geometric structure of data, especially when features are derived from structured spaces like brain imaging. For instance, in Voxel-Based Morphometry (VBM), measurements from distinct brain regions follow a clear spatial organization, which standard logistic regression cannot fully leverage. In this paper, we propose Sinkhorn Logistic Regression (SLR), a variant of logistic regression that incorporates the Sinkhorn divergence as a loss function. This adaptation enables the model to leverage geometric information about the data distribution, enhancing its performance on structured datasets.
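As a rough illustration of the quantity the abstract refers to, the sketch below computes a Sinkhorn divergence between two histograms defined over spatially organized bins (for example, brain regions). It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the regularization strength `eps`, the iteration count, and the squared-distance ground cost are all hypothetical choices. Each term uses the transport cost under the entropic plan, which is one common simplification of the debiased form S(a, b) = OT(a, b) - 0.5 OT(a, a) - 0.5 OT(b, b). How SLR incorporates the divergence into the logistic regression objective is not described in this abstract and is not reproduced here.

```python
import numpy as np

def entropic_ot_cost(a, b, C, eps=0.1, n_iter=500):
    """Transport cost <P, C> of the entropy-regularized plan between histograms a and b.

    Hypothetical helper: a and b are non-negative vectors summing to 1,
    C is the ground-cost matrix between their bins.
    """
    K = np.exp(-C / eps)              # Gibbs kernel, strictly positive
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):           # Sinkhorn scaling iterations
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]   # approximate transport plan
    return float(np.sum(P * C))

def sinkhorn_divergence(a, b, C, eps=0.1, n_iter=500):
    """Debiased divergence: OT(a, b) - 0.5 * OT(a, a) - 0.5 * OT(b, b).

    Assumes a and b share the same bins, so one cost matrix C serves all terms.
    """
    return (entropic_ot_cost(a, b, C, eps, n_iter)
            - 0.5 * entropic_ot_cost(a, a, C, eps, n_iter)
            - 0.5 * entropic_ot_cost(b, b, C, eps, n_iter))

# Assumed toy geometry: five bins placed evenly on a line, squared-distance cost.
positions = np.linspace(0.0, 1.0, 5)
C = (positions[:, None] - positions[None, :]) ** 2

a = np.array([1.0, 0.0, 0.0, 0.0, 0.0])       # all mass in the first bin
b_near = np.array([0.0, 1.0, 0.0, 0.0, 0.0])  # mass in the adjacent bin
b_far = np.array([0.0, 0.0, 0.0, 0.0, 1.0])   # mass in the most distant bin

d_self = sinkhorn_divergence(a, a, C)
d_near = sinkhorn_divergence(a, b_near, C)
d_far = sinkhorn_divergence(a, b_far, C)
```

In this toy setup the divergence grows with the spatial distance between the bins that hold the mass. A bin-by-bin loss such as cross-entropy would not show this, because it treats the near and far cases as equally wrong.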