IEEE J Biomed Health Inform. 2018 Jul;22(4):1197-1208. doi: 10.1109/JBHI.2017.2732287. Epub 2017 Jul 26.
Multi-modality data convey complementary information that can be used to improve the accuracy of prediction models in disease diagnosis. However, effectively integrating multi-modality data remains a challenging problem, especially when the data are incomplete. For instance, more than half of the subjects in the Alzheimer's Disease Neuroimaging Initiative (ADNI) database have no fluorodeoxyglucose positron emission tomography (FDG-PET) or cerebrospinal fluid (CSF) data. Two strategies are commonly used to handle incomplete data: 1) discarding samples that have missing features; and 2) imputing the missing values via specific techniques. The first strategy loses a significant amount of useful information, and the second may introduce additional noise and artifacts into the data. Moreover, previous studies generally focus on pairwise relationships among subjects, without considering their underlying complex (e.g., high-order) relationships. To address these issues, we propose a multi-hypergraph learning method for dealing with incomplete multi-modality data. Specifically, we first divide the subjects into several groups according to the availability of their data modalities and construct multiple hypergraphs to represent the high-order relationships among the subjects in these groups. A hypergraph-regularized transductive learning method is then applied to these groups for automatic diagnosis of brain diseases. Extensive evaluation of the proposed method using all subjects in the baseline ADNI database indicates that our method achieves promising results in Alzheimer's disease (AD)/mild cognitive impairment (MCI) classification, compared with state-of-the-art methods.
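The two steps named in the abstract (building hypergraphs per modality group, then hypergraph-regularized transductive learning) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how hyperedges are formed, how groups are defined, or which objective is optimized. The sketch assumes one group per modality (all subjects for which that modality is available), k-nearest-neighbour hyperedges, the standard normalized hypergraph Laplacian, and a simple squared-loss objective with a closed-form solution. All function names, the parameters `k` and `lam`, and the synthetic data are hypothetical.

```python
import numpy as np

def knn_hypergraph(X, k=3):
    """Incidence matrix H (vertices x hyperedges): one hyperedge per subject,
    containing that subject and its k nearest neighbours (assumed construction)."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, -1.0)  # guarantees each subject is first in its own ranking
    H = np.zeros((n, n))
    for e in range(n):
        H[np.argsort(d2[e])[: k + 1], e] = 1.0
    return H

def multi_hypergraph(modalities, k=3):
    """modalities: list of (n_subjects, d_m) arrays; rows of NaN mark a missing modality.
    Builds one hypergraph per modality on the subjects that have it, then
    concatenates all hyperedges over the full subject set."""
    n = modalities[0].shape[0]
    blocks = []
    for X in modalities:
        idx = np.flatnonzero(~np.isnan(X).any(axis=1))  # subjects with this modality
        H_full = np.zeros((n, len(idx)))
        H_full[idx] = knn_hypergraph(X[idx], k)
        blocks.append(H_full)
    return np.hstack(blocks)

def hypergraph_laplacian(H, w=None):
    """Normalized Laplacian I - Dv^-1/2 H W De^-1 H^T Dv^-1/2.
    Requires every subject to belong to at least one hyperedge."""
    n_v, n_e = H.shape
    w = np.ones(n_e) if w is None else np.asarray(w, dtype=float)
    dv = H @ w          # vertex degrees
    de = H.sum(axis=0)  # hyperedge degrees
    dv_inv_sqrt = np.diag(1.0 / np.sqrt(dv))
    theta = dv_inv_sqrt @ H @ np.diag(w / de) @ H.T @ dv_inv_sqrt
    return np.eye(n_v) - theta

def transductive_scores(L, y, lam=1.0):
    """Minimizes f^T L f + lam * ||f - y||^2, i.e. solves (L + lam I) f = lam y.
    y holds +1 / -1 for labelled subjects and 0 for unlabelled ones."""
    return np.linalg.solve(L + lam * np.eye(L.shape[0]), lam * y)

def demo():
    rng = np.random.default_rng(0)
    truth = np.array([1] * 6 + [-1] * 6)
    centers = np.where(truth[:, None] == 1, 0.0, 10.0)
    mod_a = centers + rng.normal(scale=0.3, size=(12, 2))  # available for everyone
    mod_b = centers + rng.normal(scale=0.3, size=(12, 3))
    mod_b[[4, 5, 10, 11]] = np.nan                         # missing for four subjects
    y = np.zeros(12)
    y[0], y[6] = 1.0, -1.0                                 # one labelled subject per class
    L = hypergraph_laplacian(multi_hypergraph([mod_a, mod_b], k=3))
    return transductive_scores(L, y), truth

if __name__ == "__main__":
    scores, truth = demo()
    print(np.sign(scores).astype(int), truth)
```

In this sketch, subjects with a missing modality are neither discarded nor imputed. They simply contribute no hyperedges for that modality, while still receiving label information through the hyperedges built from the modalities they do have.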