An integrated database-pipeline system for studying single nucleotide polymorphisms and diseases.

Authors

Yang Jin Ok, Hwang Sohyun, Oh Jeongsu, Bhak Jong, Sohn Tae-Kwon

Affiliation

Korean BioInformation Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon, 305-806, Korea.

Publication information

BMC Bioinformatics. 2008 Dec 12;9 Suppl 12(Suppl 12):S19. doi: 10.1186/1471-2105-9-S12-S19.

DOI:10.1186/1471-2105-9-S12-S19

PMID:19091018

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC2638159/

Abstract

BACKGROUND

Studying the relationship between disease and genetic variations such as single nucleotide polymorphisms (SNPs) is important because genetic variations can cause disease by influencing important biological regulation processes. Despite the need to analyze SNP-disease correlations, most existing databases provide information only on functional variants at specific locations on the genome, or cover only a few disease-associated genes. No combined resource broadly supports gene-, SNP-, and disease-related information and captures the relationships among such data. Therefore, we developed an integrated database-pipeline system for studying SNPs and diseases.

RESULTS

To implement the pipeline system for the integrated database, we first unified complicated and redundant disease terms and gene names, using the Unified Medical Language System (UMLS) for classification and noun modification, and the HUGO Gene Nomenclature Committee (HGNC) and NCBI gene databases. Next, we collected and integrated representative databases for three categories of information. For genes and proteins, we examined the NCBI mRNA, UniProt, UCSC Table Track, and MitoDat databases. For genetic variants, we used the dbSNP, JSNP, ALFRED, and HGVbase databases. For disease, we employed the OMIM, GAD, and HGMD databases. The database-pipeline system provides a disease thesaurus that includes the genes and SNPs associated with each disease. Search results for these categories are available at http://diseasome.kobic.re.kr/. A genome browser is also available to highlight findings and to permit convenient review of potentially deleterious SNPs among genes strongly associated with specific diseases and clinical phenotypes.
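The two steps described above, unifying terms and then integrating sources, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the synonym tables, identifiers such as `DIS_A` and `GENE1`, and the function names `normalize` and `integrate` are all hypothetical placeholders, standing in for mappings that a real pipeline would derive from UMLS, HGNC, and NCBI Gene.

```python
from collections import defaultdict

# Toy synonym tables (hypothetical placeholders, not real identifiers).
# A real pipeline would derive these from UMLS for disease terms and
# from HGNC / NCBI Gene for gene names.
DISEASE_SYNONYMS = {
    "disease a": "DIS_A",
    "disease a, familial": "DIS_A",
    "disease b": "DIS_B",
}
GENE_SYNONYMS = {
    "gene1": "GENE1",
    "g1-alias": "GENE1",
    "gene2": "GENE2",
}


def normalize(term, table):
    """Map a raw term to its canonical ID, or None if it is unknown."""
    return table.get(term.strip().lower())


def integrate(gene_disease_rows, snp_gene_rows):
    """Build disease -> {gene -> set of SNP IDs} from two raw sources."""
    # Group SNP IDs under the canonical name of the gene they map to.
    snps_by_gene = defaultdict(set)
    for snp_id, gene in snp_gene_rows:
        canonical = normalize(gene, GENE_SYNONYMS)
        if canonical is not None:
            snps_by_gene[canonical].add(snp_id)

    # Attach each unified gene (with its SNPs) to each unified disease.
    thesaurus = defaultdict(dict)
    for disease, gene in gene_disease_rows:
        d = normalize(disease, DISEASE_SYNONYMS)
        g = normalize(gene, GENE_SYNONYMS)
        if d is None or g is None:
            continue  # skip rows whose terms cannot be unified
        thesaurus[d][g] = set(snps_by_gene.get(g, set()))
    return dict(thesaurus)


# Redundant spellings in the raw rows collapse onto the same canonical IDs.
gene_disease_rows = [
    ("Disease A", "GENE1"),
    ("Disease A, familial", "G1-alias"),
    ("Disease B", "gene2"),
    ("Unknown disease", "GENE1"),
]
snp_gene_rows = [
    ("snp_001", "g1-alias"),
    ("snp_002", "GENE1"),
    ("snp_003", "GENE2"),
]

thesaurus = integrate(gene_disease_rows, snp_gene_rows)
```

In this sketch, rows that cannot be mapped to a canonical disease or gene are dropped; a production pipeline might instead queue them for manual curation.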

CONCLUSION

Our system is designed to capture the relationships between disease-associated SNPs and disease-causing genes. The integrated database-pipeline provides a list of candidate genes and SNP markers for evaluation in both epidemiological and molecular biological approaches to disease-gene association studies. Researchers can then select the data set for an association study semi-automatically while considering the relationships between genetic variation and diseases. The database can also make disease-association studies more economical and facilitate an understanding of the processes that cause disease. Currently, the database contains 14,674 SNP records and 109,715 gene records associated with human diseases, and it is updated at regular intervals.
