Hierarchical Multi-Label Classification With Gene-Environment Interactions in Disease Modeling.

Authors

Li Jingmao, Zhang Qingzhao, Ma Shuangge, Fang Kuangnan, Xu Yaqing

Affiliations

Department of Statistics and Data Science, School of Economics, Xiamen University, Fujian, China.

The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen, China.

Publication

Stat Med. 2025 Feb 10;44(3-4):e10330. doi: 10.1002/sim.10330.

Abstract

In biomedical studies, gene-environment (G-E) interactions have been demonstrated to have important implications for analyzing disease outcomes beyond the main G and main E effects. Many approaches have been developed for G-E interaction analysis, yielding important findings. However, hierarchical multi-label classification, which provides insightful information on disease outcomes, remains unexplored in G-E analysis literature. Moreover, unlabeled data are commonly observed in practical settings but omitted by many existing methods of hierarchical multi-label classification. In this study, we consider a semi-supervised scenario and develop a novel approach for the two-layer hierarchical response with G-E interactions. A two-step penalized estimation is then proposed using an efficient expectation-maximization (EM) algorithm. Simulation shows that it has superior performance in classification and feature selection. The analysis of The Cancer Genome Atlas (TCGA) data on lung cancer demonstrates the practical utility of the proposed method. Overall, this study can fill the important knowledge gap in G-E interaction analysis by providing a widely applicable framework for hierarchical multi-label classification of complex disease outcomes.
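The abstract does not spell out the model, so the following is only a generic illustration of what "G-E interactions beyond the main G and main E effects" usually means in practice: the design matrix contains the E variables, the G variables, and every pairwise product of a G and an E variable. The function name `build_ge_design`, the column ordering, and the toy data are assumptions made for this sketch and are not taken from the paper; the paper's hierarchical response, penalization, and EM steps are not reproduced here.

```python
import numpy as np


def build_ge_design(G, E):
    """Stack main E effects, main G effects, and all pairwise G x E products.

    Hypothetical helper for illustration only.
    G: (n, p) array of genetic measurements.
    E: (n, q) array of environmental/clinical measurements.
    Returns an (n, q + p + p*q) array with columns ordered as
    [E_1..E_q, G_1..G_p, G_1*E_1, ..., G_1*E_q, G_2*E_1, ..., G_p*E_q].
    """
    G = np.asarray(G, dtype=float)
    E = np.asarray(E, dtype=float)
    if G.ndim != 2 or E.ndim != 2 or G.shape[0] != E.shape[0]:
        raise ValueError("G and E must be 2-D with the same number of rows")
    n, p = G.shape
    q = E.shape[1]
    # Broadcasting gives an (n, p, q) array whose [i, j, k] entry is G[i, j] * E[i, k].
    interactions = (G[:, :, None] * E[:, None, :]).reshape(n, p * q)
    return np.hstack([E, G, interactions])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    G = rng.normal(size=(5, 3))   # toy data: 5 subjects, 3 genes
    E = rng.normal(size=(5, 2))   # toy data: 5 subjects, 2 E factors
    X = build_ge_design(G, E)
    print(X.shape)  # (5, 11): 2 main E + 3 main G + 6 interactions
```

With this layout, the interaction between gene `j` and environmental factor `k` (both zero-based) sits in column `q + p + j*q + k`, which makes it straightforward to group each gene's main effect with its interactions when a penalty is applied.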

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9dea/12201914/90c026407aa0/nihms-2088546-f0001.jpg
