A new disease-specific machine learning approach for the prediction of cancer-causing missense variants.

Affiliation

Department of Bioengineering, Stanford University, Stanford, CA 94305, USA.

Publication information

Genomics. 2011 Oct;98(4):310-7. doi: 10.1016/j.ygeno.2011.06.010. Epub 2011 Jul 7.

Abstract

High-throughput genotyping and sequencing techniques are rapidly and inexpensively providing large amounts of human genetic variation data. Single Nucleotide Polymorphisms (SNPs) are an important source of human genome variability and have been implicated in several human diseases, including cancer. Amino acid mutations resulting from non-synonymous SNPs in coding regions may cause changes in protein function that affect cell proliferation. In this study, we developed a machine learning approach to predict cancer-causing missense variants. We present a Support Vector Machine (SVM) classifier trained on a set of 3163 cancer-causing variants and an equal number of neutral polymorphisms. The method achieves 93% overall accuracy, a correlation coefficient of 0.86, and an area under the ROC curve of 0.98. Compared with previously developed algorithms such as SIFT and CHASM, our method achieves higher prediction accuracy and a higher correlation coefficient in identifying cancer-causing variants.
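To illustrate how the reported accuracy and correlation coefficient relate on a balanced dataset, here is a minimal sketch. It assumes the correlation coefficient is the Matthews correlation coefficient (the abstract does not name it) and uses a hypothetical confusion matrix with symmetric errors; the counts `tp`, `tn`, `fp`, and `fn` are illustrative values chosen to match 93% accuracy, not results taken from the paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all variants that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient computed from a 2x2 confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Class size from the abstract: 3163 cancer-causing and 3163 neutral variants.
N_PER_CLASS = 3163

# Hypothetical counts (assumption): errors split evenly between the two classes.
tp = tn = 2942
fp = fn = N_PER_CLASS - tp  # 221 misclassified variants per class

acc = accuracy(tp, tn, fp, fn)
mcc = matthews_cc(tp, tn, fp, fn)
print(round(acc, 2), round(mcc, 2))  # prints: 0.93 0.86
```

Under these assumptions (equal class sizes and equal error rates in both classes), the Matthews coefficient reduces to 2 × accuracy − 1, which is consistent with the reported pair of 93% and 0.86. The actual per-class error rates are not given in the abstract.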
