Constantinou Anthony Costa, Fenton Norman, Marsh William, Radlinski Lukasz
Risk and Information Management Research Group, School of Electronic Engineering and Computer Science, Queen Mary University of London, Mile End Campus, Mile End Road, Computer Science Building, E1 4NS London, UK.
Artif Intell Med. 2016 Feb;67:75-93. doi: 10.1016/j.artmed.2016.01.002. Epub 2016 Jan 16.
(1) To develop a rigorous and repeatable method for building effective Bayesian network (BN) models for medical decision support from complex, unstructured and incomplete patient questionnaires and interviews that inevitably contain examples of repetitive, redundant and contradictory responses; (2) To exploit expert knowledge in the BN development since further data acquisition is usually not possible; (3) To ensure the BN model can be used for interventional analysis; (4) To demonstrate why using data alone to learn the model structure and parameters is often unsatisfactory even when extensive data is available.
The method applies a range of recent BN developments targeted at helping experts build BNs given limited data. While most of the components of the method are based on established work, its novelty is that it provides a rigorous, consolidated and generalised framework that addresses the whole life-cycle of BN model development. The method is derived from two recent, original and validated BN models in forensic psychiatry, known as DSVM-MSS and DSVM-P.
When evaluated on the same datasets, the DSVM-MSS demonstrated competitive to superior predictive performance (AUC scores of 0.708 and 0.797) against the state-of-the-art (AUC scores ranging from 0.527 to 0.705), and the DSVM-P demonstrated superior predictive performance (cross-validated AUC score of 0.78) against the state-of-the-art (AUC scores ranging from 0.665 to 0.717). More importantly, the resulting models go beyond improved predictive accuracy: they are useful for risk management through intervention, and they enhance decision support by answering complex clinical questions that are based on unobserved evidence.
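For readers unfamiliar with the AUC metric quoted above, the following is a minimal sketch of how it can be computed from predicted risk scores. It uses the pairwise (Mann-Whitney) definition of AUC. The function name `auc` and the example scores and labels are hypothetical illustrations, not data or code from the DSVM-MSS or DSVM-P studies.

```python
def auc(scores, labels):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case is scored above a randomly chosen negative case
    (ties count as half a win)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks and observed outcomes (1 = event occurred).
example_scores = [0.9, 0.4, 0.35, 0.6, 0.5, 0.1]
example_labels = [1, 1, 0, 1, 0, 0]

if __name__ == "__main__":
    print(round(auc(example_scores, example_labels), 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the scores reported above should be read.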
This development process is applicable to any application domain that involves large-scale decision analysis based on such complex information, rather than on data with hard facts, and that incorporates expert knowledge for decision support via intervention. The novelty extends to challenging decision scientists to build models around the information that is really required for inference, rather than around whatever data are available, which in turn forces them to use the available data in a much smarter way.
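The distinction between predicting from observed evidence and analysing an intervention can be illustrated with a toy three-node BN. This is a sketch under stated assumptions only: the variables `U` (an unobserved patient factor), `T` (treatment) and `Y` (adverse outcome), the structure U → T, U → Y, T → Y, and all probabilities are hypothetical and are not taken from the DSVM-MSS or DSVM-P models.

```python
from itertools import product

# Hypothetical conditional probability tables (all values are assumptions).
p_u = {1: 0.5, 0: 0.5}                      # P(U=u)
p_t1_given_u = {1: 0.8, 0: 0.2}             # P(T=1 | U=u)
p_y1_given_tu = {(1, 1): 0.6, (1, 0): 0.2,  # P(Y=1 | T=t, U=u)
                 (0, 1): 0.8, (0, 0): 0.4}


def joint(u, t, y, do_t=None):
    """Joint probability P(u, t, y). If do_t is set, the arc U -> T is cut
    and T is fixed to do_t, which models an intervention on T."""
    pu = p_u[u]
    if do_t is None:
        pt = p_t1_given_u[u] if t == 1 else 1.0 - p_t1_given_u[u]
    else:
        pt = 1.0 if t == do_t else 0.0
    py1 = p_y1_given_tu[(t, u)]
    py = py1 if y == 1 else 1.0 - py1
    return pu * pt * py


def p_y1_given_t(t, do=False):
    """P(Y=1 | T=t) if do is False, otherwise P(Y=1 | do(T=t))."""
    do_t = t if do else None
    numerator = sum(joint(u, t, 1, do_t) for u in (0, 1))
    denominator = sum(joint(u, t, y, do_t)
                      for u, y in product((0, 1), repeat=2))
    return numerator / denominator


if __name__ == "__main__":
    print("observed   P(Y=1 | T=1)     =", round(p_y1_given_t(1), 3))
    print("intervened P(Y=1 | do(T=1)) =", round(p_y1_given_t(1, do=True), 3))
```

In this toy network the observed and interventional probabilities differ because `U` influences both who receives treatment and the outcome. A model learned purely for prediction would not separate the two, which is one reason a causal structure informed by expert knowledge matters for intervention.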