Department of Mathematical Sciences, UW-Milwaukee, Milwaukee, WI, United States of America.
Department of Mathematics and Statistics, Utah State University, Logan, UT, United States of America.
PLoS One. 2021 May 3;16(5):e0250963. doi: 10.1371/journal.pone.0250963. eCollection 2021.
Time-to-event analysis is common in political science, and machine learning methods have seen increasing use in quantitative political science research in recent years. This article advocates implementing machine learning duration models to assist in a sound model selection process. We provide a brief tutorial introduction to the random survival forest (RSF) algorithm and contrast it with a popular predecessor, the Cox proportional hazards model, with emphasis on methodological utility for political science researchers. We implement both methods on simulated time-to-event data and on the Power-Sharing Event Dataset (PSED) to help researchers evaluate the merits of machine learning duration models. We provide evidence of significantly higher survival probabilities for peace agreements whose design and implementation are mediated by a third party. We also detect increased survival probabilities for peace agreements that incorporate territorial power-sharing and avoid multiple rebel party signatories. Further, the RSF, a previously under-used method for analyzing political science time-to-event data, provides a novel approach for ranking the importance of peace agreement criteria in predicting peace agreement duration. Our findings demonstrate a scenario exhibiting the interpretability and performance of RSF for political science time-to-event data. These findings support the robust interpretability and competitive performance of the random survival forest algorithm in numerous circumstances and promote a diverse, holistic model-selection process for time-to-event political science data.