LASSO-Patternsearch algorithm with application to ophthalmology and genomic data.

Authors

Shi Weiliang, Wahba Grace, Wright Stephen, Lee Kristine, Klein Ronald, Klein Barbara

Affiliation

Department of Statistics, University of Wisconsin, 1300 University Avenue, Madison WI 53706.

Publication

Stat Interface. 2008;1(1):137-153. doi: 10.4310/sii.2008.v1.n1.a12.

Abstract

The LASSO-Patternsearch algorithm is proposed to efficiently identify patterns of multiple dichotomous risk factors for outcomes of interest in demographic and genomic studies. The patterns considered are those that arise naturally from the log linear expansion of the multivariate Bernoulli density. The method is designed for the case where there is a possibly very large number of candidate patterns but it is believed that only a relatively small number are important. A LASSO is used to greatly reduce the number of candidate patterns, using a novel computational algorithm that can handle an extremely large number of unknowns simultaneously. The patterns surviving the LASSO are further pruned in the framework of (parametric) generalized linear models. A novel tuning procedure based on the GACV for Bernoulli outcomes, modified to act as a model selector, is used at both steps. We applied the method to myopia data from the population-based Beaver Dam Eye Study, exposing physiologically interesting interacting risk factors. We then applied the method to data from a generative model of Rheumatoid Arthritis based on Problem 3 from the Genetic Analysis Workshop 15, successfully demonstrating its potential to efficiently recover higher order patterns from attribute vectors of length typical of genomic studies.
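
To make the first step concrete, here is a minimal pure-Python sketch of the general idea, not the authors' implementation. It assumes a pattern is the product of a subset of 0/1 risk factors, builds all patterns up to second order, and fits an L1-penalized logistic regression by plain proximal gradient descent. The function names (`make_patterns`, `pattern_features`, `l1_logistic`), the synthetic data, and the hand-picked penalty `lam` are all illustrative assumptions. The paper's own large-scale solver, its GACV-based tuning, and the second-stage pruning by parametric generalized linear models are not reproduced here.

```python
import math
import random
from itertools import combinations


def make_patterns(p, max_order):
    """Index subsets of size 1..max_order; subset S stands for the pattern prod_{j in S} x_j."""
    return [s for q in range(1, max_order + 1) for s in combinations(range(p), q)]


def pattern_features(x, patterns):
    """Evaluate every pattern on one 0/1 vector x: 1.0 if all x_j with j in S are 1, else 0.0."""
    return [1.0 if all(x[j] for j in s) else 0.0 for s in patterns]


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of t * |v|: shrink toward zero and clip at zero.
    return math.copysign(max(abs(v) - t, 0.0), v)


def l1_logistic(B, y, lam, iters=300):
    """Proximal gradient for mean logistic loss + lam * sum|c_j|; the intercept mu is unpenalized."""
    n, m = len(B), len(B[0])
    # Safe step 1/L: the gradient's Lipschitz constant is at most (m + 1) / 4
    # for 0/1 features plus an intercept column.
    step = 4.0 / (m + 1)
    mu, c = 0.0, [0.0] * m
    for _ in range(iters):
        g_mu, g = 0.0, [0.0] * m
        for row, yi in zip(B, y):
            r = (sigmoid(mu + sum(cj * bj for cj, bj in zip(c, row))) - yi) / n
            g_mu += r
            for j, bj in enumerate(row):
                if bj:
                    g[j] += r
        mu -= step * g_mu
        c = [soft_threshold(cj - step * gj, step * lam) for cj, gj in zip(c, g)]
    return mu, c


# Synthetic data (an assumption for illustration): the outcome depends only
# on the second-order pattern x0 * x1.
random.seed(0)
p, n = 5, 300
X = [[1 if random.random() < 0.5 else 0 for _ in range(p)] for _ in range(n)]
y = [1 if random.random() < sigmoid(-2.0 + 4.0 * x[0] * x[1]) else 0 for x in X]

patterns = make_patterns(p, max_order=2)  # 5 main effects + 10 pairs
B = [pattern_features(x, patterns) for x in X]
mu, c = l1_logistic(B, y, lam=0.03)

# Patterns whose coefficient was not shrunk exactly to zero survive the LASSO step.
survivors = {s: round(cj, 3) for s, cj in zip(patterns, c) if cj != 0.0}
print(survivors)
```

In this sketch the surviving patterns would then be passed to a second, parametric fit for further pruning, as the abstract describes. The number of candidate patterns grows combinatorially with the number of risk factors and the pattern order, which is why the paper relies on a specialized solver rather than a simple loop like this one.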
