The Weighting is the Hardest Part: On the Behavior of the Likelihood Ratio Test and the Score Test Under a Data-Driven Weighting Scheme in Sequenced Samples.

Author information

Minică Camelia C, Genovese Giulio, Hultman Christina M, Pool René, Vink Jacqueline M, Neale Michael C, Dolan Conor V, Neale Benjamin M

Affiliations

Department of Biological Psychology, Vrije Universiteit, Amsterdam, The Netherlands.

The Stanley Center for Psychiatric Research, Broad Institute of the Massachusetts Institute of Technology and Harvard, Cambridge, MA.

Publication information

Twin Res Hum Genet. 2017 Apr;20(2):108-118. doi: 10.1017/thg.2017.7. Epub 2017 Feb 27.

DOI: 10.1017/thg.2017.7
PMID: 28238293
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5357183/
Abstract

Sequence-based association studies are at a critical inflexion point with the increasing availability of exome-sequencing data. A popular test of association is the sequence kernel association test (SKAT). Weights are embedded within SKAT to reflect the hypothesized contribution of the variants to the trait variance. Because the true weights are generally unknown, and so are subject to misspecification, we examined the efficiency of a data-driven weighting scheme. We propose the use of a set of theoretically defensible weighting schemes, of which, we assume, the one that gives the largest test statistic is likely to capture best the allele frequency-functional effect relationship. We show that the use of alternative weights obviates the need to impose arbitrary frequency thresholds. As both the score test and the likelihood ratio test (LRT) may be used in this context, and may differ in power, we characterize the behavior of both tests. The two tests have equal power, if the weights in the set included weights resembling the correct ones. However, if the weights are badly specified, the LRT shows superior power (due to its robustness to misspecification). With this data-driven weighting procedure the LRT detected significant signal in genes located in regions already confirmed as associated with schizophrenia - the PRRC2A (p = 1.020e-06) and the VARS2 (p = 2.383e-06) - in the Swedish schizophrenia case-control cohort of 11,040 individuals with exome-sequencing data. The score test is currently preferred for its computational efficiency and power. Indeed, assuming correct specification, in some circumstances, the score test is the most powerful test. However, LRT has the advantageous properties of being generally more robust and more powerful under weight misspecification. This is an important result given that, arguably, misspecified models are likely to be the rule rather than the exception in weighting-based approaches.
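
The data-driven weighting idea in the abstract (try several defensible weighting schemes, keep the one with the strongest signal) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the candidate weights or how the selection step is calibrated. The sketch below assumes a SKAT-style weighted linear kernel statistic, a hypothetical candidate set of three Beta-density weightings of minor allele frequency, and a permutation-based min-p correction as a stand-in for the paper's LRT and score test machinery. All function and variable names are illustrative.

```python
import math
import random

def beta_pdf(x, a, b):
    """Beta(a, b) density at x, for 0 < x < 1."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1) * (1 - x) ** (b - 1)

# Hypothetical candidate set (an assumption, not taken from the paper):
# each scheme maps a minor allele frequency (MAF) to a variant weight.
WEIGHT_SCHEMES = {
    "beta(1,25)": lambda maf: beta_pdf(maf, 1, 25),       # strongly up-weights rare variants
    "beta(0.5,0.5)": lambda maf: beta_pdf(maf, 0.5, 0.5),  # milder up-weighting
    "flat": lambda maf: 1.0,                               # all variants equal
}

def skat_q(genotypes, residuals, weights):
    """Weighted linear kernel statistic: Q = sum_j (w_j * sum_i g_ij * r_i)^2."""
    q = 0.0
    for j, w in enumerate(weights):
        score_j = sum(row[j] * r for row, r in zip(genotypes, residuals))
        q += (w * score_j) ** 2
    return q

def scheme_weights(genotypes):
    """Per-scheme weight vectors; monomorphic variants get weight 0."""
    n = len(genotypes)
    mafs = [sum(row[j] for row in genotypes) / (2 * n)
            for j in range(len(genotypes[0]))]
    return {name: [fn(m) if 0 < m < 1 else 0.0 for m in mafs]
            for name, fn in WEIGHT_SCHEMES.items()}

def data_driven_test(genotypes, phenotype, n_perm=200, seed=0):
    """Pick the scheme with the smallest permutation p-value, then adjust
    that p-value for having searched over schemes (min-p procedure)."""
    rng = random.Random(seed)
    mean = sum(phenotype) / len(phenotype)
    residuals = [y - mean for y in phenotype]  # null model: intercept only
    weights = scheme_weights(genotypes)
    names = list(weights)

    def all_q(res):
        return [skat_q(genotypes, res, weights[k]) for k in names]

    # Pool = observed statistics followed by statistics from permuted residuals.
    pool = [all_q(residuals)]
    for _ in range(n_perm):
        shuffled = residuals[:]
        rng.shuffle(shuffled)
        pool.append(all_q(shuffled))

    def p_values(qs):
        # Rank-based p-value of each scheme's statistic within the pool.
        return [sum(p[k] >= qs[k] for p in pool) / len(pool)
                for k in range(len(names))]

    obs_p = p_values(pool[0])
    best = min(range(len(names)), key=lambda k: obs_p[k])
    # Adjusted p: how often a pool member's best scheme does at least as well.
    adjusted = sum(min(p_values(q)) <= obs_p[best] for q in pool) / len(pool)
    return names[best], obs_p[best], adjusted
```

Calling `data_driven_test(genotypes, phenotype)` with a list of per-individual genotype rows (allele counts per variant) and a quantitative phenotype returns the selected scheme name, its unadjusted permutation p-value, and the selection-adjusted p-value. The adjustment matters because taking the best of several weightings inflates the unadjusted p-value's false-positive rate.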

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c2e7/5357183/4ecdf944355b/nihms844022f1.jpg
