Exploration of a diversity of computational and statistical measures of association for genome-wide genetic studies.

Authors

Manduchi Elisabetta, Orzechowski Patryk R, Ritchie Marylyn D, Moore Jason H

Affiliations

1. Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, USA.

2. Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA.

Publication information

BioData Min. 2019 Jul 9;12:14. doi: 10.1186/s13040-019-0201-4. eCollection 2019.

DOI:10.1186/s13040-019-0201-4
PMID:31320928
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6617598/
Abstract

BACKGROUND

The principal line of investigation in Genome Wide Association Studies (GWAS) is the identification of main effects, that is, individual Single Nucleotide Polymorphisms (SNPs) that are associated with the trait of interest independently of other factors. A variety of methods have been proposed to this end, mostly statistical in nature and differing in their assumptions and in the type of model employed. Moreover, for a given model, there may be multiple choices for the SNP genotype encoding. As an alternative to statistical methods, machine learning methods are often applicable. Typically, for a given GWAS, a single approach is selected and used to identify potential SNPs of interest. Even when multiple GWAS are combined through meta-analyses within a consortium, each GWAS is typically analyzed with a single approach and the resulting summary statistics are then used in the meta-analyses.
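To make the notion of genotype encoding concrete, the following minimal sketch shows four encodings that are common in the GWAS literature (additive, dominant, recessive, heterozygous). The abstract does not say which encodings the authors compared, so the scheme names, the `ENCODINGS` table, and the `encode` helper are illustrative assumptions, not the paper's implementation. Genotypes are assumed to be given as minor-allele counts (0, 1 or 2).

```python
from typing import Dict, List

# Assumed input: each genotype is the number of minor alleles (0, 1 or 2).
# These four schemes are common textbook encodings; they are NOT taken
# from the paper, which does not list its encodings in the abstract.
ENCODINGS: Dict[str, Dict[int, int]] = {
    "additive":     {0: 0, 1: 1, 2: 2},  # effect grows with each minor allele
    "dominant":     {0: 0, 1: 1, 2: 1},  # one minor allele is enough
    "recessive":    {0: 0, 1: 0, 2: 1},  # two minor alleles are required
    "heterozygous": {0: 0, 1: 1, 2: 0},  # only heterozygotes differ
}


def encode(genotypes: List[int], scheme: str) -> List[int]:
    """Recode a list of minor-allele counts under the named scheme."""
    table = ENCODINGS[scheme]
    return [table[g] for g in genotypes]


if __name__ == "__main__":
    sample = [0, 1, 2, 1, 0]
    for name in ENCODINGS:
        print(name, encode(sample, name))
```

The same SNP can therefore enter a given model as several different predictor vectors, which is one reason a single GWAS admits many applicable approaches.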

RESULTS

In this work, we use a Type 2 Diabetes (T2D) GWAS and a breast cancer GWAS as case studies to explore a diversity of applicable approaches spanning different methods and encoding choices. We assess the similarity of these approaches based on the ranked lists of SNPs they produce and, for each GWAS, we identify a subset of representative approaches that we use as an ensemble to derive a union list of top SNPs. Among these are SNPs identified by multiple approaches as well as several SNPs identified by only one or a few of the less frequently used approaches. The latter include SNPs from established loci and SNPs that have other supporting lines of evidence for their potential relevance to the traits.
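The idea of comparing approaches through their ranked SNP lists, keeping a few representatives, and taking the union of their top SNPs can be sketched as follows. The abstract does not state which similarity measure or selection procedure was used, so everything here is an assumption made for illustration: Jaccard overlap of the top-k SNPs as the similarity, a greedy pass with a fixed threshold to pick representatives, and the function and approach names.

```python
from typing import Dict, List, Set


def top_k(ranked: List[str], k: int) -> Set[str]:
    """Return the k highest-ranked SNP identifiers as a set."""
    return set(ranked[:k])


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def pick_representatives(
    rankings: Dict[str, List[str]], k: int, threshold: float
) -> List[str]:
    """Greedy pass: keep an approach only if its top-k list overlaps
    every already-kept approach by less than `threshold`."""
    reps: List[str] = []
    for name, ranked in rankings.items():
        mine = top_k(ranked, k)
        if all(jaccard(mine, top_k(rankings[r], k)) < threshold for r in reps):
            reps.append(name)
    return reps


def union_of_top(
    rankings: Dict[str, List[str]], reps: List[str], k: int
) -> Dict[str, List[str]]:
    """Map each SNP in the union of top-k lists to the approaches that found it."""
    support: Dict[str, List[str]] = {}
    for name in reps:
        for snp in rankings[name][:k]:
            support.setdefault(snp, []).append(name)
    return support


if __name__ == "__main__":
    # Hypothetical approaches and SNP identifiers, for illustration only.
    rankings = {
        "logistic_additive": ["rs1", "rs2", "rs3", "rs4"],
        "logistic_dominant": ["rs1", "rs2", "rs3", "rs5"],
        "random_forest": ["rs6", "rs1", "rs7", "rs2"],
    }
    reps = pick_representatives(rankings, k=3, threshold=0.5)
    print(reps)
    print(union_of_top(rankings, reps, k=3))
```

In this toy input, the two logistic approaches share the same top three SNPs, so only one of them is kept, while the third approach contributes SNPs that the others do not rank highly. SNPs supported by a single approach are the ones a single-method analysis would be most likely to miss.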

CONCLUSIONS

Not every main effect analysis method is suitable for every GWAS, but for each GWAS there are typically multiple applicable methods and encoding options. We suggest a workflow for a single GWAS, extensible to multiple GWAS from consortia, where representative approaches are selected among a pool of suitable options, to yield a more comprehensive set of SNPs, potentially including SNPs that would typically be missed with the most popular analyses, but that could provide additional valuable insights for follow-up.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b76b/6617598/1154644445e9/13040_2019_201_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b76b/6617598/a0af2e564020/13040_2019_201_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b76b/6617598/3cab9f4391db/13040_2019_201_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b76b/6617598/b1f04bee6e3c/13040_2019_201_Fig4_HTML.jpg
