Andrews Bryan, Ramsey Joseph, Sánchez-Romero Rubén, Camchong Jazmin, Kummerfeld Erich
Department of Psychiatry & Behavioral Sciences, University of Minnesota, Minneapolis, MN 55454.
Department of Philosophy, Carnegie Mellon University, Pittsburgh, PA 15213.
Adv Neural Inf Process Syst. 2023 Dec;36:63945-63956. Epub 2024 May 30.
Learning graphical conditional independence structures is an important machine learning problem and a cornerstone of causal discovery. However, the accuracy and execution time of learning algorithms generally struggle to scale to problems with hundreds of highly connected variables, such as recovering brain networks from fMRI data. We introduce the best order score search (BOSS) and grow-shrink trees (GSTs) for learning directed acyclic graphs (DAGs) in this paradigm. BOSS greedily searches over permutations of variables, using GSTs to construct and score DAGs from permutations. GSTs efficiently cache scores to eliminate redundant calculations. BOSS achieves state-of-the-art performance in accuracy and execution time, comparing favorably to a variety of combinatorial and gradient-based learning algorithms under a broad range of conditions. To demonstrate its practicality, we apply BOSS to two sets of resting-state fMRI data: simulated data with pseudo-empirical noise distributions derived from randomized empirical fMRI cortical signals, and clinical data from 3T fMRI scans processed into cortical parcels. BOSS is available for use within the TETRAD project, which includes Python and R wrappers.
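The permutation search described in the abstract can be illustrated with a minimal sketch. This is not the TETRAD implementation. It assumes a linear-Gaussian BIC score, uses a plain dictionary cache in place of a grow-shrink tree, and uses one plausible greedy move (relocating a single variable to its best-scoring position). All function names (`make_bic_score`, `grow_shrink`, `score_order`, `greedy_order_search`) are hypothetical.

```python
import math
import random

def make_bic_score(data):
    """Build a local linear-Gaussian BIC score (higher is better) from rows of floats."""
    n, p = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(p)]
    cov = [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in data) / n
            for j in range(p)] for i in range(p)]
    memo = {}

    def residual_variance(y, ps):
        # Solve cov[ps, ps] @ b = cov[ps, y] by Gaussian elimination with pivoting.
        k = len(ps)
        a = [[cov[i][j] for j in ps] + [cov[i][y]] for i in ps]
        for c in range(k):
            piv = max(range(c, k), key=lambda r: abs(a[r][c]))
            a[c], a[piv] = a[piv], a[c]
            for r in range(c + 1, k):
                f = a[r][c] / a[c][c]
                for j in range(c, k + 1):
                    a[r][j] -= f * a[c][j]
        b = [0.0] * k
        for c in reversed(range(k)):
            b[c] = (a[c][k] - sum(a[c][j] * b[j] for j in range(c + 1, k))) / a[c][c]
        return cov[y][y] - sum(b[i] * cov[ps[i]][y] for i in range(k))

    def score(y, parents):
        key = (y, frozenset(parents))
        if key not in memo:  # each local score is computed at most once
            var = residual_variance(y, sorted(key[1]))
            memo[key] = -n * math.log(var) - len(key[1]) * math.log(n)
        return memo[key]

    return score

def grow_shrink(y, prefix, score):
    """Pick parents of y from its predecessors: greedy additions, then greedy removals."""
    parents, best = set(), score(y, frozenset())
    for grow in (True, False):
        improved = True
        while improved:
            improved = False
            pool = [x for x in prefix if x not in parents] if grow else list(parents)
            cands = [(score(y, parents | {x} if grow else parents - {x}), x) for x in pool]
            if cands and max(cands)[0] > best:
                best, x = max(cands)
                parents = parents | {x} if grow else parents - {x}
                improved = True
    return frozenset(parents), best

def score_order(order, score, cache):
    """Score a permutation; results are cached by (variable, set of predecessors)."""
    total, dag = 0.0, {}
    for i, y in enumerate(order):
        key = (y, frozenset(order[:i]))
        if key not in cache:
            cache[key] = grow_shrink(y, key[1], score)
        dag[y], s = cache[key]
        total += s
    return total, dag

def greedy_order_search(variables, score):
    """Move one variable at a time to its best position until no move improves the score."""
    order, cache = list(variables), {}
    best, dag = score_order(order, score, cache)
    improved = True
    while improved:
        improved = False
        for v in list(order):
            rest = [u for u in order if u != v]
            for pos in range(len(order)):
                cand = rest[:pos] + [v] + rest[pos:]
                s, d = score_order(cand, score, cache)
                if s > best + 1e-9:
                    best, dag, order, improved = s, d, cand, True
    return order, dag, best

# Toy data from the chain X0 -> X1 -> X2 (an assumption for illustration only).
rng = random.Random(0)
rows = []
for _ in range(500):
    x0 = rng.gauss(0, 1)
    x1 = 0.8 * x0 + rng.gauss(0, 1)
    x2 = 0.8 * x1 + rng.gauss(0, 1)
    rows.append([x0, x1, x2])
order, dag, best = greedy_order_search(range(3), make_bic_score(rows))
skeleton = {frozenset((child, parent)) for child, ps in dag.items() for parent in ps}
print(order, dag, skeleton)
```

Because each variable's parents are drawn only from variables earlier in the permutation, every returned graph is acyclic by construction. The cache pays off because neighboring permutations share most of their (variable, predecessor set) pairs, so relocating one variable rescores only a few entries.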