Correlating multiple SNPs and multiple disease phenotypes: penalized non-linear canonical correlation analysis.

Author information

Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, University of Amsterdam, Meibergdreef 9, 1100 DD Amsterdam, The Netherlands.

Publication information

Bioinformatics. 2009 Nov 1;25(21):2764-71. doi: 10.1093/bioinformatics/btp491. Epub 2009 Aug 17.

Abstract

MOTIVATION

Canonical correlation analysis (CCA) can be used to capture the underlying genetic background of a complex disease by associating two datasets that contain a patient's phenotypic and genetic details. Genetic information is often measured on a qualitative scale; consequently, ordinary CCA cannot be applied to such data. Moreover, the size of the data in genetic studies can be enormous, which makes the results difficult to interpret.

RESULTS

We developed a penalized non-linear CCA approach that can deal with qualitative data by transforming each qualitative variable into a continuous variable through optimal scaling. Additionally, sparse results were obtained by adapting soft-thresholding to this non-linear version of the CCA. By means of simulation studies, we show that our method is capable of extracting relevant variables out of high-dimensional sets. We applied our method to a genetic dataset containing 144 patients with glial cancer.
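
The two ingredients named above, optimal scaling and soft-thresholding, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `soft_threshold` and `optimal_scale`, the penalty value, and the toy data are assumptions made for illustration. Optimal scaling is shown in its generic nominal form, where each category of a qualitative variable is replaced by the mean of a target variate within that category.

```python
import numpy as np

def soft_threshold(weights, penalty):
    """Shrink each weight toward zero by `penalty`.

    Weights whose absolute value is at most `penalty` become exactly zero,
    which is what produces a sparse set of selected variables.
    """
    weights = np.asarray(weights, dtype=float)
    return np.sign(weights) * np.maximum(np.abs(weights) - penalty, 0.0)

def optimal_scale(categories, target):
    """Quantify a nominal variable against a continuous target variate.

    Each category is replaced by the mean of `target` over the samples in
    that category (the least-squares quantification), and the result is
    standardized to mean 0 and unit variance.
    """
    categories = np.asarray(categories)
    target = np.asarray(target, dtype=float)
    scaled = np.empty_like(target)
    for category in np.unique(categories):
        mask = categories == category
        scaled[mask] = target[mask].mean()
    centered = scaled - scaled.mean()
    spread = centered.std()
    return centered / spread if spread > 0 else centered

# Toy data (hypothetical): one SNP with three genotypes and a canonical variate.
snp = ["AA", "Aa", "AA", "aa"]
variate = [1.0, 2.0, 3.0, 6.0]
snp_scaled = optimal_scale(snp, variate)

# Toy canonical weights (hypothetical): small weights are set to zero.
sparse_weights = soft_threshold([0.5, -0.2, 0.05], penalty=0.1)
```

In a full penalized non-linear CCA these two steps would presumably alternate inside an iterative loop that updates the canonical variates; that loop is omitted here because the abstract does not describe it.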

CONTACT

s.waaijenborg@amc.uva.nl.
