Rocha Artur, Camacho Rui, Ruwaard Jeroen, Riper Heleen
INESC TEC - Institute for Systems and Computer Engineering, Technology and Science, Porto, Portugal.
DEI & Faculdade de Engenharia & LIAAD-INESC TEC, Universidade do Porto, Portugal.
Internet Interv. 2018 Mar 13;12:176-180. doi: 10.1016/j.invent.2018.03.003. eCollection 2018 Jun.
Clinical trials of blended Internet-based treatments deliver a wealth of data from various sources, such as self-report questionnaires, diagnostic interviews, treatment platform log files and Ecological Momentary Assessments (EMA). Mining these complex data for clinically relevant patterns is a daunting task for which no definitive best method exists. In this paper, we explore the expressive power of the multi-relational Inductive Logic Programming (ILP) data mining approach, using combined trial data of the EU E-COMPARED depression trial.
We explored the capability of ILP to handle and combine (implicit) multiple relationships in the E-COMPARED data. This data set has the following features that favor ILP analysis: 1) time reasoning is involved; 2) there is a reasonable number of explicit, useful relations to be analyzed; 3) ILP is capable of building comprehensible models that domain experts might perceive as putative explanations; 4) both numerical and statistical models may coexist within ILP models if necessary. In our analyses, we focused on scores of the PHQ-8 self-report questionnaire (which taps depressive symptom severity), and on EMA of mood and various other clinically relevant factors. Both measures were administered during treatment, which lasted between 9 and 16 weeks.
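The relational, time-aware view described above can be illustrated with a minimal Python sketch. It is not the authors' ILP system (ILP systems typically work on Prolog-like clauses), and it does not learn anything: it only checks one hand-written candidate rule against a few facts. All patients, scores, relation names (`PHQ8`, `EMA_MOOD`, `LOGINS`) and thresholds are invented for illustration and are not E-COMPARED data or findings.

```python
from statistics import mean

# Toy, invented facts (NOT E-COMPARED data). Each relation is a list of tuples.
# phq8(patient, week, score)
PHQ8 = [
    ("p1", 0, 18), ("p1", 3, 10), ("p1", 9, 6),
    ("p2", 0, 17), ("p2", 3, 16), ("p2", 9, 15),
    ("p3", 0, 15), ("p3", 3, 14), ("p3", 9, 8),
]
# ema_mood(patient, week, rating)
EMA_MOOD = [
    ("p1", 1, 6), ("p1", 2, 7),
    ("p2", 1, 3), ("p2", 2, 4),
    ("p3", 1, 4), ("p3", 2, 4),
]
# platform_login(patient, week)
LOGINS = [("p1", 1), ("p1", 2), ("p1", 3), ("p2", 1), ("p3", 2), ("p3", 3)]


def early_improver(patient, max_week=4, min_drop=5):
    """Hypothetical target concept: the PHQ-8 score drops by at least
    min_drop points between baseline (week 0) and some week <= max_week."""
    baseline = [s for p, w, s in PHQ8 if p == patient and w == 0]
    if not baseline:
        return False
    return any(
        p == patient and 0 < w <= max_week and baseline[0] - s >= min_drop
        for p, w, s in PHQ8
    )


def candidate_rule(patient):
    """Hypothetical clause body combining two other relations:
    at least 3 logins in weeks 1-3 AND mean EMA mood >= 5 in weeks 1-2."""
    logins = sum(1 for p, w in LOGINS if p == patient and 1 <= w <= 3)
    moods = [r for p, w, r in EMA_MOOD if p == patient and 1 <= w <= 2]
    return logins >= 3 and bool(moods) and mean(moods) >= 5


def coverage(patients):
    """Count how many early improvers the rule covers (true positives)
    and how many non-improvers it wrongly covers (false positives)."""
    positives = {p for p in patients if early_improver(p)}
    covered = {p for p in patients if candidate_rule(p)}
    return len(positives & covered), len(covered - positives)


if __name__ == "__main__":
    print(coverage(["p1", "p2", "p3"]))
```

An ILP system would search over many such candidate clauses and keep those with good coverage. The sketch only shows why temporal conditions and joins across several relations are natural to express in this representation, and why the resulting rule stays readable for a domain expert.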
E-COMPARED trial data revealed different individual improvement patterns: PHQ-8 scores suggested that some individuals improved quickly during the first weeks of the treatment, while others improved at a (much) slower pace, or not at all. Combining self-reported Ecological Momentary Assessments (EMA), PHQ-8 scores and log data about the usage of the ICT4D platform in the context of blended care, we set out to unveil possible causes for these different trajectories.
This work complements other studies into alternative data mining approaches to E-COMPARED trial data analysis, all of which aim to identify clinically meaningful predictors of system use and treatment outcome. Strengths and limitations of the ILP approach with respect to this objective are discussed.