Sparse conditional logistic regression for analyzing large-scale matched data from epidemiological studies: a simple algorithm.

Authors

Avalos Marta, Pouyes Hélène, Grandvalet Yves, Orriols Ludivine, Lagarde Emmanuel

Publication information

BMC Bioinformatics. 2015;16 Suppl 6(Suppl 6):S1. doi: 10.1186/1471-2105-16-S6-S1. Epub 2015 Apr 17.

DOI:10.1186/1471-2105-16-S6-S1
PMID:25916593
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4416185/
Abstract

This paper considers the problem of estimation and variable selection for large high-dimensional data (high number of predictors p and large sample size N, without excluding the possibility that N < p) resulting from an individually matched case-control study. We develop a simple algorithm for the adaptation of the Lasso and related methods to the conditional logistic regression model. Our proposal relies on the simplification of the calculations involved in the likelihood function. Then, the proposed algorithm iteratively solves reweighted Lasso problems using cyclical coordinate descent, computed along a regularization path. This method can handle large problems and deal with sparse features efficiently. We discuss benefits and drawbacks with respect to the existing available implementations. We also illustrate the interest and use of these techniques on a pharmacoepidemiological study of medication use and traffic safety.
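
The iteration described in the abstract (reweighted Lasso problems solved by cyclical coordinate descent along a regularization path) can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes 1:1 matched pairs, in which case the conditional likelihood reduces to an intercept-free logistic likelihood on the within-pair differences d_i = x_case,i - x_control,i. The function names (`soft_threshold`, `fit_pairs_lasso`, `lasso_path`), the simulated data, and the dense NumPy arrays are all assumptions made for the example; the sketch does not exploit sparse features.

```python
import numpy as np

def soft_threshold(a, lam):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    return np.sign(a) * max(abs(a) - lam, 0.0)

def fit_pairs_lasso(D, lam, beta=None, n_outer=50, n_inner=100, tol=1e-8):
    """L1-penalized conditional logistic fit for 1:1 matched pairs (sketch).

    D[i] = x_case[i] - x_control[i]. Minimizes
    (1/n) * sum_i log(1 + exp(-D[i] @ beta)) + lam * ||beta||_1.
    """
    n, p = D.shape
    beta = np.zeros(p) if beta is None else beta.copy()
    for _ in range(n_outer):
        beta_old = beta.copy()
        # Quadratic approximation of the log-likelihood at the current beta.
        eta = D @ beta
        prob = 1.0 / (1.0 + np.exp(-np.clip(eta, -30, 30)))
        w = np.clip(prob * (1.0 - prob), 1e-5, None)  # IRLS weights
        r = (1.0 - prob) / w                          # working residual z - eta
        # Cyclical coordinate descent on the weighted Lasso problem.
        for _ in range(n_inner):
            max_change = 0.0
            for j in range(p):
                dj = D[:, j]
                denom = np.sum(w * dj * dj) / n
                if denom == 0.0:
                    continue  # feature is constant within every pair
                rho = np.sum(w * dj * r) / n + denom * beta[j]
                new = soft_threshold(rho, lam) / denom
                if new != beta[j]:
                    r -= dj * (new - beta[j])
                    max_change = max(max_change, abs(new - beta[j]))
                    beta[j] = new
            if max_change < tol:
                break
        if np.max(np.abs(beta - beta_old)) < tol:
            break
    return beta

def lasso_path(D, n_lambda=20, ratio=0.01):
    """Fit along a decreasing lambda grid, warm-starting each fit."""
    n = D.shape[0]
    lam_max = 0.5 * np.max(np.abs(D.sum(axis=0))) / n  # smallest lam with beta = 0
    lams = lam_max * np.logspace(0, np.log10(ratio), n_lambda)
    beta = np.zeros(D.shape[1])
    path = []
    for lam in lams:
        beta = fit_pairs_lasso(D, lam, beta)
        path.append(beta.copy())
    return lams, np.array(path)

# Simulated 1:1 matched data (hypothetical): only the first two features matter.
rng = np.random.default_rng(0)
n_pairs, p = 400, 10
true_beta = np.zeros(p)
true_beta[:2] = [1.5, -1.0]
X1 = rng.normal(size=(n_pairs, p))
X2 = rng.normal(size=(n_pairs, p))
p_first = 1.0 / (1.0 + np.exp(-(X1 - X2) @ true_beta))
first_is_case = rng.random(n_pairs) < p_first
D = np.where(first_is_case[:, None], X1 - X2, X2 - X1)

lams, path = lasso_path(D)
print("nonzero coefficients per lambda:", (np.abs(path) > 1e-8).sum(axis=1))
```

Warm-starting each fit from the previous solution on the grid is what makes computing the whole path inexpensive. For 1:M matching the likelihood no longer has this pairwise form, so the weights and working residuals above would need to be derived again.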

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/17c9/4416185/2288504f6a55/1471-2105-16-S6-S1-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/17c9/4416185/52d3def66cab/1471-2105-16-S6-S1-1.jpg

