Logistic Regression Confined by Cardinality-Constrained Sample and Feature Selection.

Publication information

IEEE Trans Pattern Anal Mach Intell. 2020 Jul;42(7):1713-1728. doi: 10.1109/TPAMI.2019.2901688. Epub 2019 Feb 26.

DOI:10.1109/TPAMI.2019.2901688

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC7331794/

Abstract

Many vision-based applications rely on logistic regression for embedding classification within a probabilistic context, such as recognition in images and videos or identifying disease-specific image phenotypes from neuroimages. Logistic regression, however, often performs poorly when trained on data that is noisy, has irrelevant features, or when the samples are distributed across the classes in an imbalanced setting; a common occurrence in visual recognition tasks. To deal with those issues, researchers generally rely on ad-hoc regularization techniques or model a subset of these issues. We instead propose a mathematically sound logistic regression model that selects a subset of (relevant) features and (informative and balanced) set of samples during the training process. The model does so by applying cardinality constraints (via ℓ0-'norm' sparsity) on the features and samples. ℓ0 defines sparsity in mathematical settings but in practice has mostly been approximated (e.g., via ℓ1 or its variations) for computational simplicity. We prove that a local minimum to the non-convex optimization problems induced by cardinality constraints can be computed by combining block coordinate descent with penalty decomposition. On synthetic, image recognition, and neuroimaging datasets, we show that the accuracy of the method is higher than alternative methods and classifiers commonly used in the literature.
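To make the idea of a cardinality constraint solved by penalty decomposition more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes only a feature constraint (at most k nonzero weights) and omits the sample-selection constraint described in the abstract. The function names (`keep_top_k`, `sparse_logreg_pd`), the penalty schedule (`rho`, `rho_growth`), and the step size are illustrative assumptions. The weights w are coupled to an auxiliary vector y through a quadratic penalty, and the two blocks are updated in turn: an inexact gradient step on w, then an exact projection of y onto the set of vectors with at most k nonzeros.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def keep_top_k(v, k):
    """Project v onto {y : ||y||_0 <= k}: keep the k largest-magnitude entries."""
    y = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[::-1][:k]
    y[idx] = v[idx]
    return y

def sparse_logreg_pd(X, t, k, rho=0.1, rho_growth=2.0, outer=15, inner=100, lr=0.5):
    """Illustrative penalty-decomposition sketch (all hyperparameters are assumptions).

    X: (n, d) feature matrix, t: labels in {0, 1}, k: maximum number of nonzero weights.
    Approximately minimises the mean logistic loss subject to ||w||_0 <= k.
    """
    n, d = X.shape
    w = np.zeros(d)
    y = np.zeros(d)
    for _ in range(outer):
        for _ in range(inner):
            # Block 1: one gradient step on loss(w) + (rho / 2) * ||w - y||^2
            grad = X.T @ (sigmoid(X @ w) - t) / n + rho * (w - y)
            w -= lr / (1.0 + rho) * grad
            # Block 2: exact minimiser over y under the cardinality constraint
            y = keep_top_k(w, k)
        # Tighten the coupling between w and y
        rho *= rho_growth
    return y
```

Calling `sparse_logreg_pd(X, t, k=2)` on a standardized feature matrix returns a weight vector with at most two nonzero entries. Replacing the projection `keep_top_k` with soft-thresholding would give the ℓ1 relaxation that the abstract contrasts against.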

Similar articles

Computing group cardinality constraint solutions for logistic regression problems.

Med Image Anal. 2017 Jan;35:58-69. doi: 10.1016/j.media.2016.05.011. Epub 2016 Jun 11.

A Classification Algorithm by Combination of Feature Decomposition and Kernel Discriminant Analysis (KDA) for Automatic MR Brain Image Classification and AD Diagnosis.

Comput Math Methods Med. 2019 Dec 30;2019:1437123. doi: 10.1155/2019/1437123. eCollection 2019.

Joint Data Harmonization and Group Cardinality Constrained Classification.

Med Image Comput Comput Assist Interv. 2016 Oct;9900:282-290. doi: 10.1007/978-3-319-46720-7_33. Epub 2016 Oct 2.

Solving Logistic Regression with Group Cardinality Constraints for Time Series Analysis.

Med Image Comput Comput Assist Interv. 2015 Oct;9351:459-466. doi: 10.1007/978-3-319-24574-4_55. Epub 2015 Nov 18.

Sparse logistic regression with a L1/2 penalty for gene selection in cancer classification.

BMC Bioinformatics. 2013 Jun 19;14:198. doi: 10.1186/1471-2105-14-198.

Locally linear transform based three-dimensional gradient ℓ0-norm minimization for spectral CT reconstruction.

Med Phys. 2020 Oct;47(10):4810-4826. doi: 10.1002/mp.14420. Epub 2020 Aug 25.

Pattern Discovery in Brain Imaging Genetics via SCCA Modeling with a Generic Non-convex Penalty.

Sci Rep. 2017 Oct 25;7(1):14052. doi: 10.1038/s41598-017-13930-y.

Chained regularization for identifying brain patterns specific to HIV infection.

Neuroimage. 2018 Dec;183:425-437. doi: 10.1016/j.neuroimage.2018.08.022. Epub 2018 Aug 21.

Reproducible evaluation of classification methods in Alzheimer's disease: Framework and application to MRI and PET data.

Neuroimage. 2018 Dec;183:504-521. doi: 10.1016/j.neuroimage.2018.08.042. Epub 2018 Aug 18.

Cited by

Subject Harmonization of Digital Biomarkers: Improved Detection of Mild Cognitive Impairment from Language Markers.

Pac Symp Biocomput. 2024;29:187-200.

Determination of Survival of Gastric Cancer Patients With Distant Lymph Node Metastasis Using Prealbumin Level and Prothrombin Time: Contour Plots Based on Random Survival Forest Algorithm on High-Dimensionality Clinical and Laboratory Datasets.

J Gastric Cancer. 2022 Apr;22(2):120-134. doi: 10.5230/jgc.2022.22.e12.

A data mining based clinical decision support system for survival in lung cancer.

Rep Pract Oncol Radiother. 2021 Dec 30;26(6):839-848. doi: 10.5603/RPOR.a2021.0088. eCollection 2021.

Automatic detection of multiple types of pneumonia: Open dataset and a multi-scale attention network.

Biomed Signal Process Control. 2022 Mar;73:103415. doi: 10.1016/j.bspc.2021.103415. Epub 2021 Dec 9.

Evaluation of Feature Selection Methods for Mammographic Breast Cancer Diagnosis in a Unified Framework.

Biomed Res Int. 2021 Oct 4;2021:6079163. doi: 10.1155/2021/6079163. eCollection 2021.

A machine learning method based on the genetic and world competitive contests algorithms for selecting genes or features in biological applications.

Sci Rep. 2021 Feb 8;11(1):3349. doi: 10.1038/s41598-021-82796-y.

Circulating tRNA-derived small RNAs (tsRNAs) signature for the diagnosis and prognosis of breast cancer.

NPJ Breast Cancer. 2021 Jan 5;7(1):4. doi: 10.1038/s41523-020-00211-7.

Training confounder-free deep learning models for medical applications.

Nat Commun. 2020 Nov 26;11(1):6010. doi: 10.1038/s41467-020-19784-9.

Joint prediction and time estimation of COVID-19 developing severe symptoms using chest CT scan.

Med Image Anal. 2021 Jan;67:101824. doi: 10.1016/j.media.2020.101824. Epub 2020 Oct 10.

Novel Machine Learning Identifies Brain Patterns Distinguishing Diagnostic Membership of Human Immunodeficiency Virus, Alcoholism, and Their Comorbidity of Individuals.

Biol Psychiatry Cogn Neurosci Neuroimaging. 2019 Jun;4(6):589-599. doi: 10.1016/j.bpsc.2019.02.003. Epub 2019 Mar 1.

References

Semi-supervised Hierarchical Multimodal Feature and Sample Selection for Alzheimer's Disease Diagnosis.

Med Image Comput Comput Assist Interv. 2016 Oct;9901:79-87. doi: 10.1007/978-3-319-46723-8_10. Epub 2016 Oct 2.

Semi-Supervised Discriminative Classification Robust to Sample-Outliers and Feature-Noises.

IEEE Trans Pattern Anal Mach Intell. 2019 Feb;41(2):515-522. doi: 10.1109/TPAMI.2018.2794470. Epub 2018 Jan 17.

Gray and White Matter Abnormalities in Treated Human Immunodeficiency Virus Disease and Their Relationship to Cognitive Function.

Clin Infect Dis. 2017 Aug 1;65(3):422-432. doi: 10.1093/cid/cix301.

Regionally Specific Brain Volumetric and Cortical Thickness Changes in HIV-Infected Patients in the HAART Era.

J Acquir Immune Defic Syndr. 2017 Apr 15;74(5):563-570. doi: 10.1097/QAI.0000000000001294.

Kernel-based Joint Feature Selection and Max-Margin Classification for Early Diagnosis of Parkinson's Disease.

Sci Rep. 2017 Jan 25;7:41069. doi: 10.1038/srep41069.

A Noise-Filtered Under-Sampling Scheme for Imbalanced Classification.

IEEE Trans Cybern. 2017 Dec;47(12):4263-4274. doi: 10.1109/TCYB.2016.2606104. Epub 2016 Oct 12.

Extracting patterns of morphometry distinguishing HIV associated neurodegeneration from mild cognitive impairment via group cardinality constrained classification.

Hum Brain Mapp. 2016 Dec;37(12):4523-4538. doi: 10.1002/hbm.23326. Epub 2016 Aug 4.

Computing group cardinality constraint solutions for logistic regression problems.

Med Image Anal. 2017 Jan;35:58-69. doi: 10.1016/j.media.2016.05.011. Epub 2016 Jun 11.

Joint feature-sample selection and robust diagnosis of Parkinson's disease from MRI data.

Neuroimage. 2016 Nov 1;141:206-219. doi: 10.1016/j.neuroimage.2016.05.054. Epub 2016 Jun 10.

Subcortical shape and volume abnormalities in an elderly HIV+ cohort.

Proc SPIE Int Soc Opt Eng. 2015 Mar 17;9417. doi: 10.1117/12.2082241.

PMID: 30835210