Correlating multiple SNPs and multiple disease phenotypes: penalized non-linear canonical correlation analysis.

Affiliation

Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, University of Amsterdam, Meibergdreef 9, 1100 DD Amsterdam, The Netherlands.

Publication information

Bioinformatics. 2009 Nov 1;25(21):2764-71. doi: 10.1093/bioinformatics/btp491. Epub 2009 Aug 17.

DOI:10.1093/bioinformatics/btp491

PMID:19689958

Abstract

MOTIVATION

Canonical correlation analysis (CCA) can be used to capture the underlying genetic background of a complex disease by associating two datasets that contain a patient's phenotypic and genetic details. The genetic information is often measured on a qualitative scale, so ordinary CCA cannot be applied to such data. Moreover, the size of the data in genetic studies can be enormous, which makes the results difficult to interpret.

RESULTS

We developed a penalized non-linear CCA approach that can deal with qualitative data by transforming each qualitative variable into a continuous variable through optimal scaling. Additionally, sparse results were obtained by adapting soft-thresholding to this non-linear version of the CCA. By means of simulation studies, we show that our method is capable of extracting relevant variables out of high-dimensional sets. We applied our method to a genetic dataset containing 144 patients with glial cancer.
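The two ingredients named above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `soft_threshold` and `optimal_scale` are hypothetical, and the scaling rule (each category receives the mean of a target canonical variate over the samples in that category, followed by standardization) is one common form of optimal scaling, assumed here for illustration only.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def soft_threshold(weights, lam):
    """Shrink each weight toward zero by lam; weights with |w| <= lam become 0."""
    out = []
    for w in weights:
        magnitude = abs(w) - lam
        if magnitude <= 0:
            out.append(0.0)  # variable is dropped from the canonical variate
        else:
            out.append(magnitude if w > 0 else -magnitude)
    return out


def optimal_scale(categories, target):
    """Quantify a qualitative variable (assumed rule, see lead-in).

    Each category is replaced by the mean of `target` over the samples in
    that category, then the result is standardized to mean 0 and unit
    population standard deviation.
    """
    groups = defaultdict(list)
    for c, t in zip(categories, target):
        groups[c].append(t)
    level_value = {c: fmean(v) for c, v in groups.items()}
    scaled = [level_value[c] for c in categories]
    mu, sd = fmean(scaled), pstdev(scaled)
    if sd == 0:
        return [0.0] * len(scaled)  # no variation between categories
    return [(s - mu) / sd for s in scaled]


# Hypothetical SNP genotypes and a hypothetical canonical variate.
genotypes = ["AA", "AA", "AG", "AG", "GG", "GG"]
variate = [1.0, 3.0, 10.0, 12.0, 1.0, 3.0]
print(optimal_scale(genotypes, variate))
print(soft_threshold([3.0, -0.5, 1.0, -2.0], 1.0))  # [2.0, 0.0, 0.0, -1.0]
```

In this toy example the heterozygous genotype gets a quantification distinct from both homozygous genotypes, which a fixed additive 0/1/2 coding could not express. The threshold `lam` controls how many variables keep a non-zero weight.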

CONTACT

s.waaijenborg@amc.uva.nl.

Similar articles

MISS: a non-linear methodology based on mutual information for genetic association studies in both population and sib-pairs analysis.

Bioinformatics. 2010 Aug 1;26(15):1811-8. doi: 10.1093/bioinformatics/btq273. Epub 2010 Jun 18.

Classification with high-dimensional genetic data: assigning patients and genetic features to known classes.

Biom J. 2008 Dec;50(6):911-26. doi: 10.1002/bimj.200810475.

Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information.

Bioinformatics. 2006 Nov 15;22(22):2729-34. doi: 10.1093/bioinformatics/btl423. Epub 2006 Aug 7.

Bioinformatics tools for single nucleotide polymorphism discovery and analysis.

Ann N Y Acad Sci. 2004 May;1020:101-9. doi: 10.1196/annals.1310.011.

Unleashing genotypes in epidemiology - A novel method for managing high throughput information.

J Biomed Inform. 2009 Dec;42(6):1029-34. doi: 10.1016/j.jbi.2009.07.005. Epub 2009 Jul 17.

Quantifying the association between gene expressions and DNA-markers by penalized canonical correlation analysis.

Stat Appl Genet Mol Biol. 2008;7(1):Article3. doi: 10.2202/1544-6115.1329. Epub 2008 Jan 23.

Cross-platform comparison and visualisation of gene expression data using co-inertia analysis.

BMC Bioinformatics. 2003 Nov 21;4:59. doi: 10.1186/1471-2105-4-59.

Robust sparse canonical correlation analysis.

BMC Syst Biol. 2016 Aug 11;10(1):72. doi: 10.1186/s12918-016-0317-9.

Tag SNP selection in genotype data for maximizing SNP prediction accuracy.

Bioinformatics. 2005 Jun;21 Suppl 1:i195-203. doi: 10.1093/bioinformatics/bti1021.

Cited by

Proc Natl Acad Sci U S A. 2022 Dec 6;119(49):e2207181119. doi: 10.1073/pnas.2207181119. Epub 2022 Dec 2.

Sparse models for correlative and integrative analysis of imaging and genetic data.

J Neurosci Methods. 2014 Nov 30;237:69-78. doi: 10.1016/j.jneumeth.2014.09.001. Epub 2014 Sep 9.

Correspondence between fMRI and SNP data by group sparse canonical correlation analysis.

Med Image Anal. 2014 Aug;18(6):891-902. doi: 10.1016/j.media.2013.10.010. Epub 2013 Oct 31.

Group sparse canonical correlation analysis for genomic data integration.

BMC Bioinformatics. 2013 Aug 12;14:245. doi: 10.1186/1471-2105-14-245.

Bi-directional gene set enrichment and canonical correlation analysis identify key diet-sensitive pathways and biomarkers of metabolic syndrome.

BMC Bioinformatics. 2010 Oct 7;11:499. doi: 10.1186/1471-2105-11-499.

Association of repeatedly measured intermediate risk factors for complex diseases with high dimensional SNP data.

Algorithms Mol Biol. 2010 Feb 11;5:17. doi: 10.1186/1748-7188-5-17.
