Automated quantitative trait locus analysis (AutoQTL).

Author information

Freda Philip J, Ghosh Attri, Zhang Elizabeth, Luo Tianhao, Chitre Apurva, Polesskaya Oksana, St Pierre Celine L, Gao Jianjun, Martin Connor D, Chen Hao, Garcia-Martinez Angel G, Wang Tengfei, Han Wenyan, Ishiwari Keita, Meyer Paul, Lamparelli Alexander, King Christopher P, Palmer Abraham A, Li Ruowang, Moore Jason H

Publication information

bioRxiv. 2023 Jan 13:2023.01.12.523835. doi: 10.1101/2023.01.12.523835.

DOI:10.1101/2023.01.12.523835

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9882220/

Abstract

BACKGROUND

Quantitative Trait Locus (QTL) analysis and Genome-Wide Association Studies (GWAS) have the power to identify variants that capture significant levels of phenotypic variance in complex traits. However, effort and time are required to select the best methods and optimize parameters and pre-processing steps. Although machine learning approaches have been shown to greatly assist in optimization and data processing, applying them to QTL analysis and GWAS is challenging due to the complexity of large, heterogeneous datasets. Here, we describe a proof of concept for an automated machine learning approach, AutoQTL, with the ability to automate many complex decisions related to analysis of complex traits and generate diverse solutions to describe relationships that exist in genetic data.

RESULTS

Using a dataset of 18 putative QTL from a large-scale GWAS of body mass index in the laboratory rat, AutoQTL captures the phenotypic variance explained under a standard additive model. It also provides evidence, from simulated data and via multiple optimal solutions, of non-additive effects, including deviations from additivity and 2-way epistatic interactions. Additionally, feature importance metrics provide different insights into the inheritance models and predictive power of multiple GWAS-derived putative QTL.
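
The contrast between a standard additive model and one that allows non-additive effects can be illustrated with a small simulation. The sketch below is not AutoQTL and does not use its API; the function names (`r_squared`, `simulate`), the two-locus setup, and all effect sizes are assumptions chosen only for illustration. It encodes genotypes as 0/1/2 allele counts, adds a heterozygote indicator as one simple way to model a deviation from additivity, adds a product term for a 2-way epistatic interaction, and compares the variance explained by ordinary least squares under each encoding.

```python
import numpy as np


def r_squared(X, y):
    """Fraction of variance in y explained by an ordinary least-squares fit on X."""
    X1 = np.column_stack([np.ones(len(y)), X])  # prepend an intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()


def simulate(n=2000, seed=0):
    """Simulate two biallelic loci (0/1/2 allele counts) and a phenotype.

    Effect sizes are arbitrary illustrative values, not estimates from any study.
    """
    rng = np.random.default_rng(seed)
    g1 = rng.binomial(2, 0.5, size=n)
    g2 = rng.binomial(2, 0.5, size=n)
    het1 = (g1 == 1).astype(float)  # heterozygote indicator: deviation from additivity
    y = 0.5 * g1 + 0.8 * het1 + 0.6 * g1 * g2 + rng.normal(0.0, 1.0, size=n)
    return g1, g2, het1, y


g1, g2, het1, y = simulate()

# Additive encoding only: one allele-count column per locus.
additive = np.column_stack([g1, g2])
# Additive encoding plus a dominance term and a 2-way interaction term.
full = np.column_stack([g1, g2, het1, g1 * g2])

r2_additive = r_squared(additive, y)
r2_full = r_squared(full, y)
print(f"additive-only R^2: {r2_additive:.3f}")
print(f"additive + dominance + epistasis R^2: {r2_full:.3f}")
```

Because the second model nests the first, its in-sample R^2 can never be lower; the size of the gap is what hints at non-additive signal, and a held-out evaluation would be needed to confirm that the gain is not overfitting.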

CONCLUSIONS

This proof-of-concept illustrates that automated machine learning techniques can be applied to genetic data and have the potential to detect both additive and non-additive effects via various optimal solutions and feature importance metrics. In the future, we aim to expand AutoQTL to accommodate omics-level datasets with intelligent feature selection strategies.

Similar articles

1

Automated quantitative trait locus analysis (AutoQTL).

bioRxiv. 2023 Jan 13:2023.01.12.523835. doi: 10.1101/2023.01.12.523835.

2

Automated quantitative trait locus analysis (AutoQTL).

BioData Min. 2023 Apr 10;16(1):14. doi: 10.1186/s13040-023-00331-3.

3

PAGER: A novel genotype encoding strategy for modeling deviations from additivity in complex trait association studies.

BioData Min. 2024 Oct 11;17(1):41. doi: 10.1186/s13040-024-00393-x.

4

A Novel Mapping Strategy Utilizing Mouse Chromosome Substitution Strains Identifies Multiple Epistatic Interactions That Regulate Complex Traits.

G3 (Bethesda). 2020 Dec 3;10(12):4553-4563. doi: 10.1534/g3.120.401824.

5

Barcoded bulk QTL mapping reveals highly polygenic and epistatic architecture of complex traits in yeast.

Elife. 2022 Feb 11;11:e73983. doi: 10.7554/eLife.73983.

6

Mapping small-effect and linked quantitative trait loci for complex traits in backcross or DH populations via a multi-locus GWAS methodology.

Sci Rep. 2016 Jul 20;6:29951. doi: 10.1038/srep29951.

7

Genome wide association analysis of the 16th QTL-MAS Workshop dataset using the Random Forest machine learning approach.

BMC Proc. 2014 Oct 7;8(Suppl 5):S4. doi: 10.1186/1753-6561-8-S5-S4. eCollection 2014.

8

An efficient unified model for genome-wide association studies and genomic selection.

Genet Sel Evol. 2017 Aug 24;49(1):64. doi: 10.1186/s12711-017-0338-x.

9

Assessing the value of phenotypic information from non-genotyped animals for QTL mapping of complex traits in real and simulated populations.

BMC Genet. 2016 Jun 21;17(1):89. doi: 10.1186/s12863-016-0394-1.

10

The power of regional heritability analysis for rare and common variant detection: simulations and application to eye biometrical traits.

Front Genet. 2013 Nov 19;4:232. doi: 10.3389/fgene.2013.00232. eCollection 2013.
