Shrunken methodology to genome-wide SNPs selection and construction of SNPs networks.

Authors

Liu Yang, Ng Michael

Affiliation

Centre for Mathematical Imaging and Vision, Hong Kong Baptist University, Hong Kong.

Publication information

BMC Syst Biol. 2010 Sep 13;4 Suppl 2(Suppl 2):S5. doi: 10.1186/1752-0509-4-S2-S5.

PMID: 20840732
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2982692/
Abstract

BACKGROUND

Recent development of high-resolution single nucleotide polymorphism (SNP) arrays allows detailed genome-wide assessment of human genome variation. There is increasing recognition of the importance of SNPs for medicine and developmental biology. However, a SNP data set typically has a large number of SNPs (e.g., 400 thousand SNPs in a genome-wide Parkinson disease data set) and only a few hundred samples. Conventional classification methods may not be effective when applied to such genome-wide SNP data.

RESULTS

In this paper, we use a shrunken dissimilarity measure to analyze and select relevant SNPs for classification problems. Examples on HapMap data and Parkinson disease (PD) data demonstrate the effectiveness of the proposed method and illustrate its potential as an analysis tool for SNP data sets. Using the Parkinson disease data as an example, we perform a whole-genome analysis. From the 367440 SNPs on all 22 chromosomes with a missing rate below 1%, we select 357 SNPs. For the unique genes in which those SNPs are located, a gene-gene similarity value is computed using GOSemSim, and gene pairs whose similarity value exceeds a threshold are selected to construct several groups of genes. For the SNPs involved in these groups of genes, the statistical software PLINK is employed to compute pairwise SNP-SNP interactions, and SNPs with significance of P < 0.01 are chosen to identify SNP networks based on their P values. Because the SNP networks are constructed from Gene Ontology knowledge, each SNP network plays a role in a biological process. An analysis shows that such networks are directly or indirectly related to Parkinson disease.
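The filtering and network-building steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the data layout, and the use of connected components to form gene groups are all assumptions, since the abstract does not say how groups are formed. The similarity scores and P values are taken as given inputs (in the paper they come from GOSemSim and PLINK). Only the 1% missing-rate cutoff and the P < 0.01 cutoff come from the abstract, and the shrunken dissimilarity measure itself is not reproduced here.

```python
from collections import defaultdict

MISSING = None  # hypothetical marker for a missing genotype call


def filter_by_missing_rate(genotypes, max_missing=0.01):
    """Keep SNPs whose fraction of missing calls is below max_missing.

    genotypes maps a SNP id to a list of per-sample calls.
    """
    kept = {}
    for snp, calls in genotypes.items():
        missing_rate = sum(c is MISSING for c in calls) / len(calls)
        if missing_rate < max_missing:
            kept[snp] = calls
    return kept


def group_genes(similarity, threshold):
    """Group genes linked by pairs whose similarity exceeds threshold.

    Grouping by connected components is an assumption of this sketch.
    similarity maps a (gene_a, gene_b) tuple to a precomputed score.
    """
    neighbours = defaultdict(set)
    for (a, b), score in similarity.items():
        if score > threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    groups, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        # Depth-first walk over genes reachable from `start`.
        stack, component = [start], set()
        while stack:
            gene = stack.pop()
            if gene in component:
                continue
            component.add(gene)
            stack.extend(neighbours[gene] - component)
        seen |= component
        groups.append(sorted(component))
    return groups


def snp_network(pair_pvalues, alpha=0.01):
    """Return sorted edges (snp_a, snp_b, p) for pairs with p < alpha.

    pair_pvalues maps a (snp_a, snp_b) tuple to a precomputed P value.
    """
    return sorted(
        (a, b, p) for (a, b), p in pair_pvalues.items() if p < alpha
    )
```

With toy inputs, `group_genes` would place genes in the same group when a chain of above-threshold pairs connects them, and `snp_network` would return only the SNP pairs that pass the P < 0.01 cutoff. A different grouping rule (for example, requiring every pair within a group to exceed the threshold) would give smaller groups.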

CONCLUSIONS

Experimental results show that our approach is suitable for handling genetic variations and provides useful knowledge in a genome-wide SNP study.
