Generalized genomic distance-based regression methodology for multilocus association analysis.

Author information

Wessel Jennifer, Schork Nicholas J

Affiliations

Polymorphism Research Laboratory, Department of Psychiatry, Divisions of Epidemiology, Center for Human Genetics and Genomics, University of California at San Diego, La Jolla, CA 92093-0603, USA.

Publication information

Am J Hum Genet. 2006 Nov;79(5):792-806. doi: 10.1086/508346. Epub 2006 Sep 21.

DOI:10.1086/508346

PMID:17033957

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1698575/

Abstract

Large-scale, multilocus genetic association studies require powerful and appropriate statistical-analysis tools that are designed to relate genotype and haplotype information to phenotypes of interest. Many analysis approaches consider relating allelic, haplotypic, or genotypic information to a trait through use of extensions of traditional analysis techniques, such as contingency-table analysis, regression methods, and analysis-of-variance techniques. In this work, we consider a complementary approach that involves the characterization and measurement of the similarity and dissimilarity of the allelic composition of a set of individuals' diploid genomes at multiple loci in the regions of interest. We describe a regression method that can be used to relate variation in the measure of genomic dissimilarity (or "distance") among a set of individuals to variation in their trait values. Weighting factors associated with functional or evolutionary conservation information of the loci can be used in the assessment of similarity. The proposed method is very flexible and is easily extended to complex multilocus-analysis settings involving covariates. In addition, the proposed method actually encompasses both single-locus and haplotype-phylogeny analysis methods, which are two of the most widely used approaches in genetic association analysis. We showcase the method with data described in the literature. Ultimately, our method is appropriate for high-dimensional genomic data and anticipates an era when cost-effective exhaustive DNA sequence data can be obtained for a large number of individuals, over and above genotype information focused on a few well-chosen loci.
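The abstract does not give formulas, so the following is only a minimal sketch of what a distance-based regression of this kind might look like, not the authors' implementation. It assumes an allele-sharing distance on minor-allele counts, Gower centering of the distance matrix, and a pseudo-F statistic assessed by permutation. All function names, the weighting scheme, and the defaults are hypothetical choices made for illustration.

```python
import numpy as np


def allele_sharing_distance(genotypes, weights=None):
    """Pairwise distance from minor-allele counts coded 0, 1, 2.

    genotypes: (n_individuals, n_loci) array.
    weights: optional per-locus weights (an assumed stand-in for
    functional or conservation scores); defaults to equal weights.
    """
    g = np.asarray(genotypes, dtype=float)
    m = g.shape[1]
    w = np.ones(m) if weights is None else np.asarray(weights, dtype=float)
    # |g_i - g_j| / 2 is the fraction of alleles not shared at a locus.
    diff = np.abs(g[:, None, :] - g[None, :, :]) / 2.0
    return (diff * w).sum(axis=2) / w.sum()


def gower_center(dist):
    """Turn a distance matrix into a double-centered inner-product matrix."""
    n = dist.shape[0]
    a = -0.5 * dist ** 2
    c = np.eye(n) - np.ones((n, n)) / n
    return c @ a @ c


def pseudo_f(g_mat, x):
    """Pseudo-F: distance variation explained by x relative to residual."""
    n = g_mat.shape[0]
    design = np.column_stack([np.ones(n), x])  # add an intercept column
    m = design.shape[1]
    hat = design @ np.linalg.pinv(design)  # projection onto the design space
    resid = np.eye(n) - hat
    explained = np.trace(hat @ g_mat @ hat) / (m - 1)
    residual = np.trace(resid @ g_mat @ resid) / (n - m)
    return explained / residual


def permutation_p_value(dist, trait, n_perm=999, seed=0):
    """Permute trait values across individuals to get an empirical p-value."""
    rng = np.random.default_rng(seed)
    g_mat = gower_center(np.asarray(dist, dtype=float))
    x = np.asarray(trait, dtype=float)
    x = x.reshape(x.shape[0], -1)
    observed = pseudo_f(g_mat, x)
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(x.shape[0])
        if pseudo_f(g_mat, x[perm]) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

With a single locus and this distance, the pseudo-F reduces to the ordinary F statistic from regressing the allele count on the trait, which is one way to see how a distance-based formulation can contain single-locus analysis as a special case. Covariates could be added as extra columns of `x`, although testing one predictor while adjusting for others would need a more careful partitioning than this sketch provides.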
