Bayesian fluorescence in situ hybridisation signal classification.

Author information

Lerner Boaz

Affiliation

Pattern Analysis & Machine Learning Lab, Department of Electrical & Computer Engineering, Ben-Gurion University, Beer-Sheva, Israel.

Publication information

Artif Intell Med. 2004 Mar;30(3):301-16. doi: 10.1016/j.artmed.2003.11.005.

Abstract

Previous research has indicated the significance of accurate classification of fluorescence in situ hybridisation (FISH) signals for the detection of genetic abnormalities. Based on well-discriminating features and a trainable neural network (NN) classifier, a previous system enabled highly accurate classification of valid signals and artefacts of two fluorophores. However, since this system employed several features that are considered independent, the naive Bayesian classifier (NBC) is suggested here as an alternative to the NN. The NBC independence assumption permits the decomposition of the high-dimensional likelihood of the model for the data into a product of one-dimensional probability densities. The naive independence assumption, together with the Bayesian methodology, allows the NBC to predict a posteriori probabilities of class membership using estimated class-conditional densities in a simple closed form. Since the probability densities are the only parameters of the NBC, the misclassification rate of the model is determined exclusively by the quality of density estimation. Densities are evaluated by three methods: single Gaussian estimation (SGE; parametric method), Gaussian mixture model assuming spherical covariance matrices (GMM; semi-parametric method) and kernel density estimation (KDE; non-parametric method). For low-dimensional densities, the GMM generally outperforms the KDE, which tends to overfit the training set at the cost of reduced generalisation capability. However, the GMM loses some accuracy when modelling higher-dimensional densities, because adding dependent features to the set violates the assumption of spherical covariance matrices. Compared with these two methods, the SGE and NN provide inferior and superior performance, respectively. However, the NBC avoids the intensive training and optimisation required for the NN, which demand extensive resources and experimentation. Therefore, by supporting both classifiers, the system enables a trade-off between the performance of the NN and the simplicity of implementing the NBC.
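
The decomposition described in the abstract can be illustrated with a short sketch. Under the naive independence assumption, the posterior for class c given features x_1, ..., x_d is proportional to P(c) multiplied by the product of the one-dimensional densities p(x_j | c). The Python code below is a minimal illustration and not the paper's implementation: the class names `NaiveBayesSGE` and `NaiveBayesKDE`, the fixed `bandwidth` value, and the synthetic two-class data are all assumptions made for this example. It covers only the single Gaussian and kernel density estimators; the spherical GMM variant is omitted.

```python
import numpy as np


def gaussian_logpdf(x, mean, var):
    """Log of a 1-D Gaussian density, applied elementwise with broadcasting."""
    return -0.5 * (np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)


class NaiveBayesSGE:
    """Naive Bayes with one 1-D Gaussian per (class, feature) pair (hypothetical name)."""

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # Class priors estimated from class frequencies in the training set.
        self.log_prior_ = np.log(np.array([(y == c).mean() for c in self.classes_]))
        self.mean_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        # Small constant guards against zero variance.
        self.var_ = np.array([X[y == c].var(axis=0) for c in self.classes_]) + 1e-9
        return self

    def log_joint(self, X):
        X = np.asarray(X, dtype=float)
        # Per-feature log-densities, shape (n_samples, n_classes, n_features).
        ll = gaussian_logpdf(X[:, None, :], self.mean_[None], self.var_[None])
        # Independence assumption: sum of logs equals log of the product.
        return self.log_prior_ + ll.sum(axis=2)

    def predict_proba(self, X):
        lj = self.log_joint(X)
        lj = lj - lj.max(axis=1, keepdims=True)  # stabilise the exponentials
        p = np.exp(lj)
        return p / p.sum(axis=1, keepdims=True)  # normalise to posteriors

    def predict(self, X):
        return self.classes_[np.argmax(self.log_joint(X), axis=1)]


class NaiveBayesKDE(NaiveBayesSGE):
    """Same classifier, but each 1-D density is a Gaussian kernel estimate."""

    def __init__(self, bandwidth=0.5):  # bandwidth is an assumed, untuned value
        self.bandwidth = bandwidth

    def fit(self, X, y):
        super().fit(X, y)
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.train_ = [X[y == c] for c in self.classes_]
        return self

    def log_joint(self, X):
        X = np.asarray(X, dtype=float)
        h2 = self.bandwidth ** 2
        columns = []
        for log_prior, Xc in zip(self.log_prior_, self.train_):
            # Kernel log-densities, shape (n_samples, n_train_in_class, n_features).
            k = gaussian_logpdf(X[:, None, :], Xc[None, :, :], h2)
            # Log of the mean kernel value per feature (log-mean-exp).
            m = k.max(axis=1, keepdims=True)
            log_dens = m[:, 0, :] + np.log(np.exp(k - m).mean(axis=1))
            columns.append(log_prior + log_dens.sum(axis=1))
        return np.stack(columns, axis=1)


# Synthetic stand-in for signal features: class 0 and class 1 are two
# well-separated clusters with two independent features each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 2)), rng.normal(5.0, 1.0, (200, 2))])
y = np.array([0] * 200 + [1] * 200)

sge = NaiveBayesSGE().fit(X, y)
kde = NaiveBayesKDE(bandwidth=0.5).fit(X, y)
sge_accuracy = (sge.predict(X) == y).mean()
kde_accuracy = (kde.predict(X) == y).mean()
```

Both classifiers share the same posterior computation and differ only in how each one-dimensional density is estimated, which mirrors the abstract's point that the density estimates are the model's only parameters.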

