Faculty of Computer Engineering, University of Isfahan, Isfahan, Iran.
Department of Biomedical Engineering, School of Advanced Medical Technology, Isfahan University of Medical Sciences, Isfahan, Iran.
BMC Med Inform Decis Mak. 2024 Oct 30;24(1):319. doi: 10.1186/s12911-024-02668-z.
DNA microarrays provide informative data for transcriptional profiling and for identifying gene expression signatures that can help prevent progression of latent tuberculosis infection (LTBI) to active disease. However, constructing a prognostic model that distinguishes LTBI from active tuberculosis (ATB) is challenging because the data are noisy and no generally stable analysis approach exists.
In the present study, we proposed an accurate predictive model based on data fusion at the decision level. The results of filter and wrapper feature selection techniques were combined with multiple-criteria decision-making (MCDM) methods to select, from six microarray datasets, the 10 genes likely to be the most discriminative for diagnosing tuberculosis cases. As the main contribution of this study, the final ranking function was constructed by combining a protein-protein interaction (PPI) network with an MCDM method, the Decision-making Trial and Evaluation Laboratory (DEMATEL), to improve the feature ranking approach.
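As a rough illustration of the DEMATEL step named above, the following sketch computes the standard total-relation matrix and the prominence and relation scores from a direct-influence matrix. The toy matrix, the idea that its entries come from PPI edge weights, and the function name `dematel` are assumptions made for illustration; the abstract does not specify how the PPI network was encoded or how the final ranking function was built.

```python
import numpy as np

def dematel(direct):
    """Standard DEMATEL scores from a non-negative direct-influence matrix."""
    A = np.asarray(direct, dtype=float)
    # Normalize by the largest row or column sum so the series D + D^2 + ... converges.
    scale = max(A.sum(axis=1).max(), A.sum(axis=0).max())
    D = A / scale
    n = A.shape[0]
    # Total-relation matrix: T = D (I - D)^-1 = D + D^2 + D^3 + ...
    T = D @ np.linalg.inv(np.eye(n) - D)
    R = T.sum(axis=1)  # influence each item exerts on the others
    C = T.sum(axis=0)  # influence each item receives
    return T, R + C, R - C  # total relation, prominence, relation

# Hypothetical 3-gene direct-influence matrix (for example, derived from PPI edge weights).
toy_direct = [[0, 3, 2],
              [1, 0, 1],
              [0, 2, 0]]
T, prominence, relation = dematel(toy_direct)
ranking = np.argsort(-prominence)  # gene indices, most prominent first
print(ranking, prominence.round(3), relation.round(3))
```

Ranking by prominence is only one possible choice; a ranking function could equally weight prominence against the filter and wrapper scores.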
Decision-level data fusion was applied to the 10 selected genes by fusing random forest (RF) and k-nearest neighbors (KNN) classifiers according to Yager's theory; the proposed algorithm reached a sensitivity of 0.97, a specificity of 0.90, and an accuracy of 0.95. Finally, with the help of cumulative clustering, the genes involved in the diagnosis of latent and active tuberculosis were introduced.
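To make the classifier fusion more concrete, here is a minimal sketch of Yager's combination rule for two sources over the two-class frame {LTBI, ATB}. Unlike Dempster's rule, Yager's rule does not renormalize; the conflicting mass is assigned to the whole frame (ignorance). The mass values and the way classifier outputs would be converted into masses are assumptions for illustration, not details taken from the study.

```python
def yager_combine(m1, m2):
    """Combine two mass functions with keys 'LTBI', 'ATB', 'theta' (the whole frame)."""
    fused = {}
    for cls in ("LTBI", "ATB"):
        # Agreement on the class, or one source commits while the other is ignorant.
        fused[cls] = (m1[cls] * m2[cls]
                      + m1[cls] * m2["theta"]
                      + m1["theta"] * m2[cls])
    # Conflict: the two sources commit to different classes.
    conflict = m1["LTBI"] * m2["ATB"] + m1["ATB"] * m2["LTBI"]
    # Yager's rule moves the conflict to the frame instead of normalizing it away.
    fused["theta"] = m1["theta"] * m2["theta"] + conflict
    return fused

# Hypothetical masses for one sample, e.g. discounted RF and KNN class probabilities.
m_rf = {"LTBI": 0.6, "ATB": 0.3, "theta": 0.1}
m_knn = {"LTBI": 0.5, "ATB": 0.2, "theta": 0.3}
fused = yager_combine(m_rf, m_knn)
decision = max(("LTBI", "ATB"), key=lambda c: fused[c])
print(fused, decision)
```

In this toy case the fused masses are about 0.53 for LTBI, 0.17 for ATB, and 0.30 for the frame, so the sample would be labeled LTBI while a sizable share of belief remains uncommitted.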
The combination of MCDM methods and PPI networks can significantly improve the diagnosis of different states of tuberculosis.
Not applicable.