Bona B, Tempo R, Belforte G
Dipartimento di Automatica e Informatica, Politecnico di Torino, Italy.
Comput Methods Programs Biomed. 1988 Nov-Dec;27(3):213-21. doi: 10.1016/0169-2607(88)90085-5.
A computer program for sequential Bayesian classification of patterns defined by integer and real-valued data is described. Classified patterns from a training sample are used to estimate the non-parametric (kernel) probability density functions and the a priori class probabilities needed to implement the Bayesian classification. For each pattern, and at each step of the sequential program, the 'best' feature to measure at the next step is computed on the basis of the estimated misallocation error rate. The user may measure the proposed feature or any other one; once the chosen feature has been measured, its value is used to allocate the pattern to the class with the highest conditional a posteriori probability, according to the Bayes formula. The distinguishing feature of the program is the computation of the 'probability of reversal' at each step of the sequential procedure. The probability of reversal is the probability that, at the next step, the pattern will be classified into a class different from the present one. It can be used as a stopping criterion, and it is more efficient than other commonly used stopping rules, such as the a posteriori Bayes probability or the estimated misallocation error rate. The program, available in FORTRAN 77 for a VAX/VMS machine, has been tested on both simulated data and real data collected from patients suffering from various forms of hepatic disease.
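The procedure described above can be illustrated with a minimal Python sketch. It is not the authors' FORTRAN 77 program, and several choices in it are assumptions made only for illustration: a Gaussian kernel with one fixed bandwidth, features treated as conditionally independent given the class, and a probability of reversal approximated by midpoint integration of the predictive density of the next feature over a grid. The names `fit`, `posterior`, and `probability_of_reversal` are hypothetical.

```python
import math


def gaussian_kernel_density(x, samples, bandwidth):
    """Kernel (Parzen) estimate of a one-dimensional density at x."""
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples) / norm


def fit(training, bandwidth=1.0):
    """training: list of (feature_tuple, label) pairs.

    A priori class probabilities are estimated as class frequencies.
    The single fixed bandwidth is an assumption of this sketch.
    """
    by_class = {}
    for features, label in training:
        by_class.setdefault(label, []).append(features)
    total = len(training)
    priors = {c: len(rows) / total for c, rows in by_class.items()}
    return {"priors": priors, "samples": by_class, "h": bandwidth}


def posterior(model, measured):
    """A posteriori class probabilities given measured = {feature_index: value}.

    Assumes features are independent given the class (an illustrative
    simplification, not something stated in the abstract).
    """
    scores = {}
    for c, prior in model["priors"].items():
        p = prior
        for j, x in measured.items():
            column = [row[j] for row in model["samples"][c]]
            p *= gaussian_kernel_density(x, column, model["h"])
        scores[c] = p
    z = sum(scores.values())
    if z == 0.0:
        # All densities underflowed; fall back to the a priori probabilities.
        return dict(model["priors"])
    return {c: s / z for c, s in scores.items()}


def probability_of_reversal(model, measured, j, grid_points=400):
    """Approximate probability that measuring feature j changes the allocation."""
    h = model["h"]
    post = posterior(model, measured)
    current = max(post, key=post.get)
    columns = {c: [row[j] for row in rows] for c, rows in model["samples"].items()}
    values = [v for column in columns.values() for v in column]
    lo, hi = min(values) - 4.0 * h, max(values) + 4.0 * h
    dx = (hi - lo) / grid_points
    prob = 0.0
    for k in range(grid_points):
        x = lo + (k + 0.5) * dx  # midpoint of the k-th grid cell
        # Unnormalised updated posterior if feature j took the value x.
        updated = {c: post[c] * gaussian_kernel_density(x, columns[c], h) for c in post}
        # Predictive density of x given everything measured so far.
        predictive = sum(updated.values())
        if max(updated, key=updated.get) != current:
            prob += predictive * dx
    return prob
```

A sequential loop built on this sketch would measure one feature at a time, recompute `posterior`, and stop once `probability_of_reversal` for the remaining candidate features falls below a user-chosen threshold. The threshold value and the rule for picking the next feature are left open here; the abstract bases the latter on the estimated misallocation error rate.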