An enhanced machine learning tool for cis-eQTL mapping with regularization and confounder adjustments.

Affiliations

School of Public Health, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China.

Department of Biostatistics, Yale University, New Haven, Connecticut.

Publication information

Genet Epidemiol. 2020 Nov;44(8):798-810. doi: 10.1002/gepi.22341. Epub 2020 Jul 22.

Abstract

Many expression quantitative trait loci (eQTL) studies have been conducted to investigate the biological effects of variants in gene regulation. However, these eQTL studies may suffer from low or moderate statistical power and an overly conservative false-discovery rate. In practice, most algorithms for eQTL identification do not model the joint effects of multiple genetic variants with weak or moderate influence. Here we present a novel machine-learning algorithm, the lasso least-squares kernel machine (LSKM-LASSO), that models the association between multiple genetic variants and phenotypic traits simultaneously in the presence of nongenetic and genetic confounding. With a more general and flexible framework for the estimation of genetic confounding, LSKM-LASSO is able to provide a more accurate evaluation of the joint effects of multiple genetic variants. Our simulations demonstrate that our approach outperforms three state-of-the-art alternatives in terms of eQTL identification and phenotype prediction. We then apply our method to genotype and gene expression data from 11 tissues obtained from the Genotype-Tissue Expression project. Our algorithm identified more genes with eQTL than the other algorithms. By incorporating a regularization term and combining it with the least-squares kernel machine, LSKM-LASSO provides a powerful tool for eQTL mapping and phenotype prediction.
