RAPIDSNPs: A new computational pipeline for rapidly identifying key genetic variants reveals previously unidentified SNPs that are significantly associated with individual platelet responses.

Authors

Salehe Bajuna Rashid, Jones Chris Ian, Di Fatta Giuseppe, McGuffin Liam James

Affiliations

School of Biological Sciences, University of Reading, Reading, United Kingdom.

Department of Computer Science, University of Reading, Reading, United Kingdom.

Publication information

PLoS One. 2017 Apr 25;12(4):e0175957. doi: 10.1371/journal.pone.0175957. eCollection 2017.

DOI:10.1371/journal.pone.0175957

PMID:28441463

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5404774/

Abstract

Advances in omics technologies have led to the discovery of genetic markers, or single nucleotide polymorphisms (SNPs), that are associated with particular diseases or complex traits. Although there have been significant improvements in the approaches used to analyse associations of SNPs with disease, faster and further optimised techniques are needed to keep up with the rate of SNP discovery, which has exacerbated the 'missing heritability' problem. Here, we have devised a novel, integrated, heuristic-based, hybrid analytical computational pipeline for rapidly detecting novel or key genetic variants that are associated with diseases or complex traits. Our pipeline is particularly useful in genetic association studies where the genotyped SNP data are highly dimensional and the complex trait phenotype involved is continuous. In particular, the pipeline is more efficient for investigating small sets of genotyped SNPs defined in high dimensional spaces that may be associated with continuous phenotypes than for investigating whole genome variants. The pipeline, which employs a consensus approach based on the random forest, was able to rapidly identify previously unseen key SNPs that are significantly associated with the platelet response phenotype, which was used as our complex trait case study. Several of these SNPs, such as rs6141803 of COMMD7 and rs41316468 in PTK2B, have independently confirmed associations with cardiovascular diseases (CVDs) according to other unrelated studies, suggesting that our pipeline is robust in identifying key genetic variants. Our new pipeline provides an important step towards addressing the problem of 'missing heritability' through enhanced detection of key genetic variants (SNPs) that are associated with continuous complex traits/disease phenotypes.
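The abstract does not spell out how the consensus across random forest results is formed. As a rough illustration only, and not the authors' implementation, the sketch below assumes that several runs each produce an importance-ordered list of SNPs and that a SNP is kept when it reaches the top of enough runs. The function name, the parameters top_k and min_votes, and the placeholder SNP IDs are all hypothetical.

```python
from collections import Counter


def consensus_snps(rankings, top_k=3, min_votes=2):
    """Return SNP IDs that appear in the top_k of at least min_votes rankings.

    rankings: list of lists, each ordered from most to least important
    (for example, one list per random forest run). Hypothetical helper,
    not part of the published pipeline.
    """
    votes = Counter()
    for ranking in rankings:
        # Count each SNP at most once per run.
        votes.update(set(ranking[:top_k]))
    kept = [snp for snp, n in votes.items() if n >= min_votes]
    # Order by vote count (descending), then by ID for a stable result.
    return sorted(kept, key=lambda snp: (-votes[snp], snp))


# Toy rankings from three hypothetical runs (placeholder IDs, not real data).
runs = [
    ["snpA", "snpB", "snpC", "snpD"],
    ["snpB", "snpA", "snpD", "snpC"],
    ["snpA", "snpD", "snpB", "snpC"],
]
selected = consensus_snps(runs, top_k=2, min_votes=2)
print(selected)
```

Voting on top-ranked membership rather than averaging raw importance scores is one simple way to damp run-to-run variation; whether the published pipeline does this is not stated in the abstract.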

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/561d/5404774/905928e478dd/pone.0175957.g001.jpg

Similar articles

Weighted Interaction SNP Hub (WISH) network method for building genetic networks for complex diseases and traits using whole genome genotype data.

BMC Syst Biol. 2014;8 Suppl 2(Suppl 2):S5. doi: 10.1186/1752-0509-8-S2-S5. Epub 2014 Mar 13.

The Relative Power of Structural Genomic Variation versus SNPs in Explaining the Quantitative Trait Growth in the Marine Teleost .

Genes (Basel). 2022 Jun 23;13(7):1129. doi: 10.3390/genes13071129.

Development and validation of a 20K single nucleotide polymorphism (SNP) whole genome genotyping array for apple (Malus × domestica Borkh).

PLoS One. 2014 Oct 10;9(10):e110377. doi: 10.1371/journal.pone.0110377. eCollection 2014.

Accuracy of prediction of simulated polygenic phenotypes and their underlying quantitative trait loci genotypes using real or imputed whole-genome markers in cattle.

Genet Sel Evol. 2015 Dec 23;47:99. doi: 10.1186/s12711-015-0179-4.

Gene-based single nucleotide polymorphism discovery in bovine muscle using next-generation transcriptomic sequencing.

BMC Genomics. 2013 May 7;14:307. doi: 10.1186/1471-2164-14-307.

SNPs in Multi-species Conserved Sequences (MCS) as useful markers in association studies: a practical approach.

BMC Genomics. 2007 Aug 6;8:266. doi: 10.1186/1471-2164-8-266.

Endometrial vezatin and its association with endometriosis risk.

Hum Reprod. 2016 May;31(5):999-1013. doi: 10.1093/humrep/dew047. Epub 2016 Mar 22.

Multitrait genome association analysis identifies new susceptibility genes for human anthropometric variation in the GCAT cohort.

J Med Genet. 2018 Nov;55(11):765-778. doi: 10.1136/jmedgenet-2018-105437. Epub 2018 Aug 30.

Discovery of shared genomic loci using the conditional false discovery rate approach.

Hum Genet. 2020 Jan;139(1):85-94. doi: 10.1007/s00439-019-02060-2. Epub 2019 Sep 13.

Cited by

CRISPR-edited megakaryocytes for rapid screening of platelet gene functions.

Blood Adv. 2021 May 11;5(9):2362-2374. doi: 10.1182/bloodadvances.2020004112.

Sensitivity analysis based on the random forest machine learning algorithm identifies candidate genes for regulation of innate and adaptive immune response of chicken.

Poult Sci. 2020 Dec;99(12):6341-6354. doi: 10.1016/j.psj.2020.08.059. Epub 2020 Sep 12.

Functional Genomics for the Identification of Modulators of Platelet-Dependent Thrombus Formation.

TH Open. 2018 Sep 10;2(3):e272-e279. doi: 10.1055/s-0038-1670630. eCollection 2018 Jul.
