A Sparse Mixture-of-Experts Model With Screening of Genetic Associations to Guide Disease Subtyping.

Author information

Courbariaux Marie, De Santiago Kylliann, Dalmasso Cyril, Danjou Fabrice, Bekadar Samir, Corvol Jean-Christophe, Martinez Maria, Szafranski Marie, Ambroise Christophe

Affiliations

Université Paris-Saclay, CNRS, Université d'Évry, Laboratoire de Mathématiques et Modélisation d'Évry, Évry-Courcouronnes, France.

Sorbonne Université, Paris Brain Institute-ICM, Inserm, CNRS, Assistance Publique Hôpitaux de Paris, Pitié-Salpêtrière Hospital, Department of Neurology, Paris, France.

Publication information

Front Genet. 2022 Jun 6;13:859462. doi: 10.3389/fgene.2022.859462. eCollection 2022.

Abstract

Identifying new genetic associations in non-Mendelian complex diseases is an increasingly difficult challenge. These diseases sometimes appear to have a significant component of heritability requiring explanation, and this missing heritability may be due to the existence of subtypes involving different genetic factors. Taking genetic information into account in clinical trials might potentially have a role in guiding the process of subtyping a complex disease. Most methods dealing with multiple sources of information rely on data transformation, and in disease subtyping, the two main strategies used are 1) the clustering of clinical data followed by posterior genetic analysis and 2) the concomitant clustering of clinical and genetic variables. Both of these strategies have limitations that we propose to address. This work proposes an original method for disease subtyping on the basis of both longitudinal clinical variables and high-dimensional genetic markers via a sparse mixture-of-regressions model. The added value of our approach lies in its interpretability in relation to two aspects. First, our model links both clinical and genetic data with regard to their initial nature (i.e., without transformation) and does not require post-processing where the original information is accessed a second time to interpret the subtypes. Second, it can address large-scale problems because of a variable selection step that is used to discard genetic variables that may not be relevant for subtyping. The proposed method was validated on simulations. A dataset from a cohort of Parkinson's disease patients was also analyzed. Several subtypes of the disease and genetic variants that potentially have a role in this typology were identified. The R code for the proposed method, named DiSuGen, and a tutorial are available for download (see the references).
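
To make the phrase "mixture-of-regressions" concrete, the following is a minimal sketch of the general idea only. It is not the DiSuGen implementation, which is distributed as R code. It fits two straight-line clinical trajectories by expectation-maximization on a small hand-made dataset. The function name `fit_mixture_of_regressions`, the toy data, the starting values, and the shared residual variance are all assumptions made for illustration. The sketch leaves out the genetic markers and the variable selection step entirely and uses constant mixing proportions.

```python
import math

def fit_mixture_of_regressions(subjects, n_iter=30):
    """EM for a two-component mixture of straight-line trajectories.

    subjects: one list of (time, value) pairs per patient.
    Returns (params, weights, resp), with params[k] = (intercept, slope)
    and resp[i][k] = posterior probability that patient i is in subtype k.
    """
    params = [(0.0, 0.5), (0.0, -0.5)]  # hypothetical starting values
    weights = [0.5, 0.5]                # constant mixing proportions (a simplification)
    sigma2 = 1.0                        # residual variance shared by both subtypes
    n_obs = sum(len(obs) for obs in subjects)
    resp = []
    for _ in range(n_iter):
        # E-step: subtype posteriors per patient, computed in log space.
        resp = []
        for obs in subjects:
            logp = []
            for (a, b), w in zip(params, weights):
                ll = sum(-0.5 * math.log(2 * math.pi * sigma2)
                         - (y - a - b * t) ** 2 / (2 * sigma2) for t, y in obs)
                logp.append(math.log(w) + ll)
            top = max(logp)
            unnorm = [math.exp(v - top) for v in logp]
            total = sum(unnorm)
            resp.append([u / total for u in unnorm])
        # M-step: weighted least squares for each subtype's line.
        new_params = []
        for k in range(2):
            sw = st = sy = stt = sty = 0.0
            for r, obs in zip(resp, subjects):
                for t, y in obs:
                    sw += r[k]
                    st += r[k] * t
                    sy += r[k] * y
                    stt += r[k] * t * t
                    sty += r[k] * t * y
            denom = sw * stt - st * st
            if denom > 1e-12:
                slope = (sw * sty - st * sy) / denom
                new_params.append(((sy - slope * st) / sw, slope))
            else:
                new_params.append(params[k])  # subtype nearly empty: keep old line
        params = new_params
        # Update mixing proportions and variance, floored to avoid log(0).
        weights = [max(sum(r[k] for r in resp) / len(subjects), 1e-12)
                   for k in range(2)]
        sse = sum(r[k] * (y - params[k][0] - params[k][1] * t) ** 2
                  for r, obs in zip(resp, subjects)
                  for t, y in obs for k in range(2))
        sigma2 = max(sse / n_obs, 1e-6)
    return params, weights, resp

def make_subject(intercept, slope, noise):
    """Toy patient with four visits at times 0..3 (made-up data)."""
    return [(t, intercept + slope * t + e) for t, e in zip(range(4), noise)]

subjects = [
    make_subject(1.0, 1.0, [0.1, -0.1, 0.05, -0.05]),
    make_subject(1.0, 1.0, [-0.05, 0.1, -0.1, 0.05]),
    make_subject(1.0, 1.0, [0.0, 0.05, -0.05, 0.1]),
    make_subject(1.0, -1.0, [0.1, -0.1, 0.05, -0.05]),
    make_subject(1.0, -1.0, [-0.05, 0.1, -0.1, 0.05]),
    make_subject(1.0, -1.0, [0.0, -0.05, 0.05, -0.1]),
]
params, weights, resp = fit_mixture_of_regressions(subjects)
# Hard subtype assignment: the most probable component for each patient.
labels = [max(range(2), key=lambda k: r[k]) for r in resp]
print(labels, params)
```

In this toy setting the subtypes can be read directly from the fitted slopes, with no transformation of the clinical measurements. Adding a gating term that depends on genetic markers, together with a sparsity penalty, would be a separate step that this sketch does not attempt.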

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aed8/9207464/64c453ad4f4b/fgene-13-859462-g003.jpg
