Department of Biomedical Informatics, University of Utah School of Medicine, Salt Lake City, Utah, USA.
J Am Med Inform Assoc. 2012 Mar-Apr;19(2):207-11. doi: 10.1136/amiajnl-2011-000309. Epub 2011 Oct 28.
Rapid advances in gene sequencing technologies have produced an unprecedented rate of discovery of human genome variation. A growing number of authoritative clinical repositories archive gene variants and disease phenotypes, yet many more gene variants still lack clear annotation or disease association. To date, gene-specific predictors have received very limited coverage in the literature. Here we evaluate "gene-specific" predictor models based on a naïve Bayesian classifier for 20 gene-disease datasets containing 3986 variants with clinically characterized patient conditions. We then compare the utility of gene-specific prediction with "all-gene" generalized prediction and with existing popular predictors. For novel and uncertain gene variants, gene-specific computational prediction models derived from clinically curated gene variant disease datasets often outperform established generalized algorithms.
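To make the distinction between "gene-specific" and "all-gene" prediction concrete, the following is a minimal sketch of a categorical naïve Bayes classifier trained either on the variants of one gene or on all variants pooled. It is not the authors' implementation: the abstract does not state which features were used, so the two features here (a substitution class and a conservation flag), the gene names `GENE_A` and `GENE_B`, the helper `train`, and every record in `TOY_RECORDS` are hypothetical and invented purely for illustration.

```python
import math
from collections import Counter


class NaiveBayes:
    """Categorical naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, rows, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.n = len(labels)
        n_features = len(rows[0])
        # counts[c][j][v] = number of class-c rows whose feature j equals v
        self.counts = {c: [Counter() for _ in range(n_features)]
                       for c in self.classes}
        # values[j] = set of distinct values seen for feature j
        self.values = [set() for _ in range(n_features)]
        for row, c in zip(rows, labels):
            for j, v in enumerate(row):
                self.counts[c][j][v] += 1
                self.values[j].add(v)
        return self

    def log_scores(self, row):
        """Unnormalized log posterior for each class."""
        scores = {}
        for c in self.classes:
            s = math.log(self.class_counts[c] / self.n)  # log prior
            for j, v in enumerate(row):
                num = self.counts[c][j][v] + 1
                den = self.class_counts[c] + len(self.values[j])
                s += math.log(num / den)  # smoothed log likelihood
            scores[c] = s
        return scores

    def predict(self, row):
        scores = self.log_scores(row)
        return max(scores, key=scores.get)


# Hypothetical toy records: (gene, (substitution_class, conservation), label).
# These values are made up and do not come from the study's datasets.
TOY_RECORDS = [
    ("GENE_A", ("radical", "conserved"), "pathogenic"),
    ("GENE_A", ("radical", "conserved"), "pathogenic"),
    ("GENE_A", ("conservative", "variable"), "benign"),
    ("GENE_A", ("conservative", "conserved"), "benign"),
    ("GENE_B", ("conservative", "conserved"), "pathogenic"),
    ("GENE_B", ("conservative", "conserved"), "pathogenic"),
    ("GENE_B", ("radical", "variable"), "benign"),
    ("GENE_B", ("radical", "variable"), "benign"),
]


def train(records, gene=None):
    """Train on one gene's variants (gene-specific) or on all (gene=None)."""
    subset = [r for r in records if gene is None or r[0] == gene]
    rows = [features for _, features, _ in subset]
    labels = [label for _, _, label in subset]
    return NaiveBayes().fit(rows, labels)


gene_specific_a = train(TOY_RECORDS, gene="GENE_A")
all_gene = train(TOY_RECORDS)
query = ("radical", "conserved")
print(gene_specific_a.predict(query), all_gene.predict(query))
```

In this toy setup the two genes associate the same feature values with opposite outcomes, which is the kind of situation where a model fitted to a single gene could in principle separate classes that a pooled model blurs. Whether that holds for real data is an empirical question of the sort the study evaluates.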