Overlapping group screening for detection of gene-environment interactions with application to TCGA high-dimensional survival genomic data.

Affiliations

Department of Statistics, Feng Chia University, Seatwen, Taichung, 40724, Taiwan.

Institute of Statistical Science, Academia Sinica, Nankang, Taipei, 11529, Taiwan.

Publication information

BMC Bioinformatics. 2022 May 30;23(1):202. doi: 10.1186/s12859-022-04750-7.

Abstract

BACKGROUND

In biomedical and epidemiological research, gene-environment (G-E) interactions are of great significance to the etiology and progression of many complex diseases. For high-dimensional genetic data, two general classes of models, marginal and joint, have been proposed to identify important interactions. Most existing approaches for identifying G-E interactions are limited by a lack of robustness to outliers or contamination in the response and predictor data. In particular, right-censored survival outcomes make the associated feature screening even more challenging. In this article, we use the overlapping group screening (OGS) approach to select important G-E interactions related to clinical survival outcomes by incorporating gene pathway information under a joint modeling framework.
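To make the idea of overlapping groups concrete, the following is a minimal sketch of how a joint design matrix and pathway-based groups might be assembled. It is not the authors' implementation: the function names, the column layout (environmental factors first, then genes, then gene-by-environment products), and the pathway labels are all assumptions made for illustration, and the screening step itself is not shown.

```python
import numpy as np


def joint_design_matrix(E, G):
    """Stack [E, G, G x E] column-wise for a joint G-E model.

    E: (n, q) environmental factors; G: (n, p) gene expressions.
    Interaction columns are ordered gene-major: column j * q + k of the
    interaction block holds G[:, j] * E[:, k]. This layout is an assumption.
    """
    n = E.shape[0]
    inter = (G[:, :, None] * E[:, None, :]).reshape(n, -1)
    return np.hstack([E, G, inter])


def build_overlapping_groups(pathways, gene_names, n_env):
    """Map each pathway to the design-matrix columns of its genes.

    Each group holds the main-effect column and all G-E interaction columns
    of every gene in the pathway. A gene that belongs to several pathways
    appears in several groups, which is what makes the groups overlap.
    """
    gene_idx = {g: j for j, g in enumerate(gene_names)}
    p = len(gene_names)
    groups = {}
    for name, members in pathways.items():
        cols = []
        for g in members:
            j = gene_idx[g]
            cols.append(n_env + j)  # main effect of gene j
            # interactions of gene j with each environmental factor k
            cols.extend(n_env + p + j * n_env + k for k in range(n_env))
        groups[name] = sorted(cols)
    return groups


# Hypothetical toy data: 2 environmental factors, 3 genes, 2 pathways.
E = np.array([[1.0, 2.0]])
G = np.array([[3.0, 4.0, 5.0]])
X = joint_design_matrix(E, G)
groups = build_overlapping_groups(
    {"P1": ["g1", "g2"], "P2": ["g2", "g3"]}, ["g1", "g2", "g3"], n_env=2
)
shared = sorted(set(groups["P1"]) & set(groups["P2"]))  # columns tied to g2
```

In this toy layout the gene `g2` sits in both pathways, so its main-effect and interaction columns are shared between the two groups. A group-level screening procedure would then score each group as a unit before any penalized fit.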

RESULTS

Simulation studies under various scenarios are carried out to compare the performance of our proposed method with that of some commonly used methods. In the real data applications, we use the proposed method to identify G-E interactions related to the clinical survival outcomes of patients with head and neck squamous cell carcinoma and with esophageal carcinoma in The Cancer Genome Atlas (TCGA) clinical survival genetic data, and then build the corresponding survival prediction models. Both the simulation and real data studies show that our method performs well and outperforms existing methods in G-E interaction selection, effect estimation, and survival prediction accuracy.

CONCLUSIONS

The OGS approach is useful for selecting important environmental factors, genes, and G-E interactions in an ultra-high dimensional feature space. The prediction ability of OGS with the Lasso penalty is better than that of existing methods. The same idea can be applied to other outcome models, such as the proportional odds survival time model, the logistic regression model for binary outcomes, and the multinomial logistic regression model for multi-class outcomes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/36ed/9150322/088d9e999031/12859_2022_4750_Fig1_HTML.jpg
