
Partition dataset according to amino acid type improves the prediction of deleterious non-synonymous SNPs.

Affiliation

School of Biotechnology, East China University of Science and Technology, Shanghai 200237, China.

Publication information

Biochem Biophys Res Commun. 2012 Mar 2;419(1):99-103. doi: 10.1016/j.bbrc.2012.01.138. Epub 2012 Feb 4.

DOI: 10.1016/j.bbrc.2012.01.138
PMID: 22326261
Abstract

Many non-synonymous SNPs (nsSNPs) are associated with diseases, and numerous machine learning methods have been applied to train classifiers for sorting disease-associated nsSNPs from neutral ones. The continuously accumulated nsSNP data allows us to further explore better prediction approaches. In this work, we partitioned the training data into 20 subsets according to either original or substituted amino acid type at the nsSNP site. Using support vector machine (SVM), training classification models on each subset resulted in an overall accuracy of 76.3% or 74.9% depending on the two different partition criteria, while training on the whole dataset obtained an accuracy of only 72.6%. Moreover, the dataset was also randomly divided into 20 subsets, but the corresponding accuracy was only 73.2%. Our results demonstrated that partitioning the whole training dataset into subsets properly, i.e., according to the residue type at the nsSNP site, will improve the performance of the trained classifiers significantly, which should be valuable in developing better tools for predicting the disease-association of nsSNPs.
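
The partition-then-train idea in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the record fields (`original`, `substituted`, `features`, `label`), the toy feature values, and all function names are hypothetical, and a simple nearest-centroid classifier stands in for the SVM so that the example needs only the standard library. The abstract does not specify the feature set, so a single made-up numeric feature is used.

```python
from collections import defaultdict


class CentroidClassifier:
    """Stand-in for the paper's SVM: predicts the class with the nearest centroid."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for features, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(features))
            for i, value in enumerate(features):
                acc[i] += value
            counts[label] += 1
        self.centroids = {
            label: [v / counts[label] for v in acc] for label, acc in sums.items()
        }
        return self

    def predict_one(self, features):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(features, centroid))

        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))


def partition_by_residue(records, key="original"):
    """Group records by the original or substituted residue at the nsSNP site."""
    subsets = defaultdict(list)
    for rec in records:
        subsets[rec[key]].append(rec)
    return subsets


def train_partitioned(records, key="original"):
    """Train one classifier per residue-type subset (up to 20 subsets)."""
    models = {}
    for residue, subset in partition_by_residue(records, key).items():
        X = [r["features"] for r in subset]
        y = [r["label"] for r in subset]
        models[residue] = CentroidClassifier().fit(X, y)
    return models


def predict(models, record, key="original"):
    """Route the record to the model of its residue type.

    Raises KeyError if that residue type was absent from the training data.
    """
    return models[record[key]].predict_one(record["features"])


def overall_accuracy(models, records, key="original"):
    """Pooled accuracy over all records, each scored by its own subset's model."""
    correct = sum(predict(models, r, key) == r["label"] for r in records)
    return correct / len(records)


# Hypothetical toy data: the same feature points in opposite directions
# for the two residue types, which a single global model could not capture.
train = [
    {"original": "R", "substituted": "W", "features": [0.9], "label": "disease"},
    {"original": "R", "substituted": "K", "features": [0.1], "label": "neutral"},
    {"original": "A", "substituted": "V", "features": [0.8], "label": "neutral"},
    {"original": "A", "substituted": "D", "features": [0.2], "label": "disease"},
]
models = train_partitioned(train, key="original")
test_rec = {"original": "R", "substituted": "C", "features": [0.85], "label": "disease"}
print(predict(models, test_rec))
```

Passing `key="substituted"` instead would correspond to the second partition criterion described in the abstract. In practice each subset's model would be evaluated on held-out data rather than on its own training records.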

