A Cautionary Note on the Effects of Population Stratification Under an Extreme Phenotype Sampling Design.

Authors

Panarella Michela, Burkett Kelly M

Affiliations

Department of Biology, University of Ottawa, Ottawa, ON, Canada.

Department of Mathematics and Statistics, University of Ottawa, Ottawa, ON, Canada.

Publication information

Front Genet. 2019 May 3;10:398. doi: 10.3389/fgene.2019.00398. eCollection 2019.

DOI:10.3389/fgene.2019.00398
PMID:31130982
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6509877/
Abstract

Extreme phenotype sampling (EPS) is a popular study design used to reduce genotyping or sequencing costs. Assuming continuous phenotype data are available on a large cohort, EPS involves genotyping or sequencing only those individuals with extreme phenotypic values. Although this design has been shown to have high power to detect genetic effects even at smaller sample sizes, little attention has been paid to the effects of confounding variables, and in particular population stratification. Using extensive simulations, we demonstrate that the false positive rate under the EPS design is greatly inflated relative to a random sample of equal size or a "case-control"-like design where the cases are from one phenotypic extreme and the controls randomly sampled. The inflated false positive rate is observed even with allele frequency and phenotype mean differences taken from European population data. We show that the effects of confounding are not reduced by increasing the sample size. We also show that including the top principal components in a logistic regression model is sufficient for controlling the type 1 error rate using data simulated with a population genetics model and using 1,000 Genomes genotype data. Our results suggest that when an EPS study is conducted, it is crucial to adjust for all confounding variables. For genetic association studies this requires genotyping a sufficient number of markers to allow for ancestry estimation. Unfortunately, this could increase the costs of a study if sequencing or genotyping was only planned for candidate genes or pathways; the available genetic data would not be suitable for ancestry correction as many of the variants could have a true association with the trait.
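
The inflation described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' simulation code: it uses two discrete subpopulations, and the allele frequencies (0.2 and 0.4), the phenotype mean shift (0.5 standard deviations), and the cohort and sample sizes are arbitrary illustrative values that are not taken from the paper. A normal-approximation test on mean genotype stands in for the logistic regression used in the paper, the random sample is split at its own median so that the same test applies, and a test stratified on the known population label stands in for adjustment by principal components. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def simulate_cohort(rng, n_cohort, freqs=(0.2, 0.4), mean_shift=0.5):
    """Simulate (population, genotype, phenotype) for a 50/50 two-population mix.

    The SNP is null: genotype and phenotype are linked only through ancestry.
    All default values are illustrative assumptions.
    """
    cohort = []
    for _ in range(n_cohort):
        pop = rng.randrange(2)
        p = freqs[pop]
        genotype = int(rng.random() < p) + int(rng.random() < p)
        phenotype = rng.gauss(mean_shift * pop, 1.0)
        cohort.append((pop, genotype, phenotype))
    return cohort


def mean_diff(high, low):
    """Difference in mean genotype (high minus low) and its estimated variance."""
    diff = fmean(high) - fmean(low)
    var = variance(high) / len(high) + variance(low) / len(low)
    return diff, var


def p_value(diff, var):
    """Two-sided normal-approximation p-value; 1.0 if the variance is zero."""
    if var <= 0:
        return 1.0
    return 2 * (1 - NormalDist().cdf(abs(diff) / math.sqrt(var)))


def unadjusted_p(high, low):
    """Compare mean genotype between the two groups, ignoring ancestry."""
    return p_value(*mean_diff([g for _, g, _ in high], [g for _, g, _ in low]))


def stratified_p(high, low):
    """Inverse-variance-weighted comparison within each known population label."""
    num = den = 0.0
    for pop in (0, 1):
        h = [g for k, g, _ in high if k == pop]
        l = [g for k, g, _ in low if k == pop]
        if len(h) < 2 or len(l) < 2:
            continue  # stratum too small to estimate a variance
        diff, var = mean_diff(h, l)
        if var > 0:
            num += diff / var
            den += 1 / var
    return p_value(num / den, 1 / den) if den > 0 else 1.0


def false_positive_rates(n_reps=200, n_cohort=2000, n_tail=100, alpha=0.05, seed=1):
    """Estimate the fraction of null replicates rejected at level alpha."""
    rng = random.Random(seed)
    hits = {"eps": 0, "random": 0, "eps_stratified": 0}
    for _ in range(n_reps):
        cohort = simulate_cohort(rng, n_cohort)
        ranked = sorted(cohort, key=lambda row: row[2])
        low, high = ranked[:n_tail], ranked[-n_tail:]
        # Random sample of equal size, split at its own median phenotype.
        sample = sorted(rng.sample(cohort, 2 * n_tail), key=lambda row: row[2])
        hits["eps"] += unadjusted_p(high, low) < alpha
        hits["random"] += unadjusted_p(sample[n_tail:], sample[:n_tail]) < alpha
        hits["eps_stratified"] += stratified_p(high, low) < alpha
    return {name: count / n_reps for name, count in hits.items()}


if __name__ == "__main__":
    print(false_positive_rates())
```

Under these assumptions, the unadjusted EPS rejection rate should come out well above both the nominal level and the rate for the random sample, while the stratified test should stay close to the nominal level. In a real study the population labels are unknown, which is why the abstract points to principal components estimated from a sufficient number of markers.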

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1376/6509877/4cc0a0694fa8/fgene-10-00398-g0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1376/6509877/1c04e788ca05/fgene-10-00398-g0001.jpg
