Exhaustive prediction of disease susceptibility to coding base changes in the human genome.

Authors

Kulkarni Vinayak, Errami Mounir, Barber Robert, Garner Harold R

Affiliation

McDermott Center for Human Growth and Development, UT Southwestern Medical Center, Dallas, TX, USA.

Publication information

BMC Bioinformatics. 2008 Aug 12;9 Suppl 9(Suppl 9):S3. doi: 10.1186/1471-2105-9-S9-S3.

DOI:10.1186/1471-2105-9-S9-S3

PMID:18793467

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2537574/

Abstract

BACKGROUND

Single Nucleotide Polymorphisms (SNPs) are the most abundant form of genomic variation and can cause phenotypic differences between individuals, including diseases. Bases are subject to various levels of selection pressure, reflected in their inter-species conservation.

RESULTS

We propose a method, not dependent on transcription information, that scores each coding base in the human genome to reflect the probability that its mutation is associated with disease. Twelve factors likely to be associated with disease alleles were chosen as input to a support vector machine prediction algorithm. The analysis yielded 83% sensitivity and 84% specificity in separating disease-like alleles, as found in the Human Gene Mutation Database, from non-disease-like alleles, as found in the Database of Single Nucleotide Polymorphisms. The algorithm was subsequently applied to every base within all known human genes, exhaustively confirming that interspecies conservation is the strongest factor for disease association. For each gene, the length-normalized average disease potential score was calculated. Of the 30 genes with the highest scores, 21 are directly associated with a disease. In contrast, of the 30 genes with the lowest scores, only one is associated with a disease in the published literature. The results strongly suggest that the highest-scoring genes are enriched for those that might contribute to disease if mutated.

CONCLUSION

This method gives researchers valuable information for identifying sensitive positions in genes that have a high disease probability, enabling them to optimize experimental designs and interpret data emerging from genetic and epidemiological studies.
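The two summary quantities reported in the abstract, sensitivity/specificity of the classifier and a length-normalized average score per gene, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the boolean label encoding (True = disease-like), the gene names, and the reading of "length-normalized average" as the sum of per-base scores divided by the number of scored coding bases are all assumptions made for illustration.

```python
from collections import defaultdict


def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for paired boolean sequences.

    True means "disease-like" (assumed encoding, not taken from the paper).
    """
    tp = fn = tn = fp = 0
    for truth, predicted in zip(labels, predictions):
        if truth and predicted:
            tp += 1
        elif truth and not predicted:
            fn += 1
        elif not truth and not predicted:
            tn += 1
        else:
            fp += 1
    # Sensitivity: fraction of true disease-like alleles that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: fraction of non-disease-like alleles correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


def gene_average_scores(base_scores):
    """Average per-base scores within each gene.

    base_scores: iterable of (gene_name, score) pairs, one per coding base.
    Dividing by the number of scored bases is the assumed meaning of
    "length-normalized".
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for gene, score in base_scores:
        totals[gene] += score
        counts[gene] += 1
    return {gene: totals[gene] / counts[gene] for gene in totals}


def rank_genes(base_scores):
    """Return gene names ordered from highest to lowest average score."""
    averages = gene_average_scores(base_scores)
    return sorted(averages, key=averages.get, reverse=True)
```

Under these assumptions, taking the first and last 30 entries of `rank_genes(...)` would correspond to the highest- and lowest-scoring gene sets that the abstract compares against the literature.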
