Automated quantitative trait locus analysis (AutoQTL).

Authors

Freda Philip J, Ghosh Attri, Zhang Elizabeth, Luo Tianhao, Chitre Apurva S, Polesskaya Oksana, St Pierre Celine L, Gao Jianjun, Martin Connor D, Chen Hao, Garcia-Martinez Angel G, Wang Tengfei, Han Wenyan, Ishiwari Keita, Meyer Paul, Lamparelli Alexander, King Christopher P, Palmer Abraham A, Li Ruowang, Moore Jason H

Affiliations

Department of Computational Biomedicine, Cedars-Sinai Medical Center, 700 N. San Vicente Blvd., Pacific Design Center, Suite G540, West Hollywood, CA, 90069, USA.

Department of Psychiatry, University of California San Diego, 9500 Gilman Dr., Mail Code: 0667, La Jolla, CA, 92093-0667, USA.

Publication information

BioData Min. 2023 Apr 10;16(1):14. doi: 10.1186/s13040-023-00331-3.

DOI:10.1186/s13040-023-00331-3
PMID:37038201
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10088184/
Abstract

BACKGROUND

Quantitative Trait Locus (QTL) analysis and Genome-Wide Association Studies (GWAS) have the power to identify variants that capture significant levels of phenotypic variance in complex traits. However, effort and time are required to select the best methods and optimize parameters and pre-processing steps. Although machine learning approaches have been shown to greatly assist in optimization and data processing, applying them to QTL analysis and GWAS is challenging due to the complexity of large, heterogeneous datasets. Here, we describe a proof of concept for an automated machine learning approach, AutoQTL, with the ability to automate many complicated decisions related to the analysis of complex traits and generate solutions to describe relationships that exist in genetic data.

RESULTS

Using a publicly available dataset of 18 putative QTL from a large-scale GWAS of body mass index in the laboratory rat, Rattus norvegicus, AutoQTL captures the phenotypic variance explained under a standard additive model. AutoQTL also detects evidence of non-additive effects including deviations from additivity and 2-way epistatic interactions in simulated data via multiple optimal solutions. Additionally, feature importance metrics provide different insights into the inheritance models and predictive power of multiple GWAS-derived putative QTL.
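
To make the distinction between additive and non-additive effects concrete, the following is a minimal sketch on simulated data. It is not AutoQTL's code or API: the two loci, the effect sizes, and the helper `r_squared` are all hypothetical, and plain least squares stands in for whatever models an automated pipeline would select. It compares the phenotypic variance explained by a standard additive encoding (allele counts 0, 1, 2) with a model that also includes a heterozygote (dominance) term and a 2-way epistatic interaction.

```python
import numpy as np

def r_squared(X, y):
    """Fraction of phenotypic variance explained by an ordinary least-squares fit."""
    X1 = np.column_stack([np.ones(len(y)), X])      # prepend an intercept column
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - resid.var() / y.var()

rng = np.random.default_rng(0)
n = 2000

# Genotypes coded as allele counts (0, 1, 2) at two hypothetical loci.
g1 = rng.binomial(2, 0.5, size=n)
g2 = rng.binomial(2, 0.5, size=n)

# Simulated phenotype (assumed effect sizes): an additive effect at locus 1,
# a heterozygote deviation at locus 2, a 2-way epistatic term, and noise.
het2 = (g2 == 1).astype(float)
y = 0.5 * g1 + 0.8 * het2 + 0.4 * g1 * g2 + rng.normal(0.0, 1.0, size=n)

additive = np.column_stack([g1, g2])                 # standard additive model
full = np.column_stack([g1, g2, het2, g1 * g2])      # plus non-additive terms

r2_additive = r_squared(additive, y)
r2_full = r_squared(full, y)
print(f"additive R^2 = {r2_additive:.3f}, with non-additive terms = {r2_full:.3f}")
```

Because the additive model is nested in the fuller one, the gap between the two R² values gives a rough indication of how much variance the additive encoding leaves unexplained in this simulation.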

CONCLUSIONS

This proof-of-concept illustrates that automated machine learning techniques can complement standard approaches and have the potential to detect both additive and non-additive effects via various optimal solutions and feature importance metrics. In the future, we aim to expand AutoQTL to accommodate omics-level datasets with intelligent feature selection and feature engineering strategies.
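
The abstract does not say which feature importance metrics are used, so the sketch below assumes one common choice, permutation importance, purely for illustration. The three loci, their effect sizes, and the helper functions are hypothetical. For brevity the importance is scored on the same data used for fitting; a held-out set would normally be preferable.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares coefficients with an intercept as the first entry."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

def score_r2(beta, X, y):
    """R^2 of the fitted coefficients on the given data."""
    pred = np.column_stack([np.ones(len(y)), X]) @ beta
    return 1.0 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)

def permutation_importance(beta, X, y, rng, n_repeats=20):
    """Mean drop in R^2 when each feature column is shuffled in turn."""
    base = score_r2(beta, X, y)
    drops = np.zeros(X.shape[1])
    for j in range(X.shape[1]):
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])     # break the link to the phenotype
            drops[j] += base - score_r2(beta, Xp, y)
    return drops / n_repeats

rng = np.random.default_rng(1)
n = 1500
X = rng.binomial(2, 0.5, size=(n, 3)).astype(float)  # three hypothetical loci
# Assumed effects: strong at locus 0, weak at locus 1, none at locus 2.
y = 1.0 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(0.0, 1.0, size=n)

beta = fit_ols(X, y)
importance = permutation_importance(beta, X, y, rng)
ranking = np.argsort(importance)[::-1]               # most important locus first
print(importance.round(3), ranking)
```

In this simulation the ranking follows the simulated effect sizes; with real putative QTL, different importance metrics can disagree, which is the kind of contrast the abstract alludes to.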


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/00a8/10088184/7d63f991fa54/13040_2023_331_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/00a8/10088184/6d968604ab70/13040_2023_331_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/00a8/10088184/261e0ff3e72f/13040_2023_331_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/00a8/10088184/fd78c23ca202/13040_2023_331_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/00a8/10088184/158f5a1b6b37/13040_2023_331_Fig5_HTML.jpg


