Omics feature selection with the extended SIS R package: identification of a body mass index epigenetic multimarker in the Strong Heart Study.

Publication information

Am J Epidemiol. 2024 Jul 8;193(7):1010-1018. doi: 10.1093/aje/kwae006.

Abstract

The statistical analysis of omics data poses a great computational challenge given their ultra-high-dimensional nature and frequent between-feature correlation. In this work, we extended the iterative sure independence screening (ISIS) algorithm by pairing it with elastic-net (Enet) and 2 adaptive variants, adaptive elastic-net (AEnet) and multistep adaptive elastic-net (MSAEnet), to improve feature selection and effect estimation in omics research. We then used genome-wide human blood DNA methylation data from American Indian participants in the Strong Heart Study (n = 2235 participants; measured in 1989-1991) to compare the performance (predictive accuracy, coefficient estimation, and computational efficiency) of the ISIS-paired regularization methods with that of a Bayesian shrinkage approach and traditional linear regression in identifying an epigenomic multimarker of body mass index (BMI). ISIS-AEnet outperformed the other methods in prediction. In biological pathway enrichment analysis of genes annotated to BMI-related differentially methylated positions, ISIS-AEnet captured most of the enriched pathways shared by at least 2 of the evaluated methods. ISIS-AEnet can favor biological discovery because it identifies the most robust biological pathways while achieving an optimal balance between bias and efficient feature selection. In the extended SIS R package, we also implemented ISIS paired with Cox and logistic regression for time-to-event and binary endpoints, respectively, and a bootstrap approach for the estimation of regression coefficients.
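The screen-then-regularize idea behind pairing sure independence screening with elastic-net can be illustrated with a minimal sketch. The code below is not the SIS R package and does not reproduce the authors' implementation. It runs a single, non-iterative screening step (ranking features by absolute marginal correlation) followed by a plain elastic-net fit by coordinate descent, on synthetic data. All function names, the penalty values, and the simulated data are assumptions made for illustration only.

```python
import math
import random


def standardize(col):
    """Center a column and scale it to unit variance (population SD)."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
    return [(v - mean) / sd for v in col]


def sis_screen(columns, y, d):
    """Keep the d features with the largest absolute marginal correlation with y."""
    n = len(y)
    y_std = standardize(y)
    scores = []
    for j, col in enumerate(columns):
        corr = sum(a * b for a, b in zip(standardize(col), y_std)) / n
        scores.append((abs(corr), j))
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:d])


def soft_threshold(z, t):
    """Soft-thresholding operator that handles the L1 part of the penalty."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net(columns, y, lam, l1_ratio, n_sweeps=200):
    """Coordinate descent for
    (1/2n)*||y - Xb||^2 + lam*(l1_ratio*||b||_1 + (1 - l1_ratio)/2*||b||^2).
    Columns are standardized and y is centered, so no intercept is fitted."""
    n = len(y)
    cols = [standardize(c) for c in columns]
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]
    beta = [0.0] * len(cols)
    for _ in range(n_sweeps):
        for j, col in enumerate(cols):
            # Covariance of feature j with the partial residual (feature j added back).
            rho = sum(c * r for c, r in zip(col, resid)) / n + beta[j]
            new = soft_threshold(rho, lam * l1_ratio) / (1.0 + lam * (1.0 - l1_ratio))
            if new != beta[j]:
                delta = new - beta[j]
                resid = [r - delta * c for r, c in zip(resid, col)]
                beta[j] = new
    return beta


# Synthetic "omics-like" data: more features (p) than samples (n), 3 true signals.
random.seed(0)
n, p = 120, 300
X = [[random.gauss(0, 1) for _ in range(n)] for _ in range(p)]  # stored feature-major
true_idx = [3, 50, 199]
y = [2.0 * X[3][i] - 2.0 * X[50][i] + 2.0 * X[199][i] + random.gauss(0, 0.5)
     for i in range(n)]

kept = sis_screen(X, y, d=20)                    # screening step
beta = elastic_net([X[j] for j in kept], y, lam=0.1, l1_ratio=0.5)
selected = [j for j, b in zip(kept, beta) if b != 0.0]
print("screened:", kept)
print("selected:", selected)
```

Screening first shrinks the problem from p features to d, so the penalized regression only has to run on a small candidate set. The iterative variant described in the abstract repeats screening on residual information to recover features that are missed by marginal correlation alone. That loop is omitted here to keep the sketch short.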
