Katz Daniel Martin, Bommarito Michael J, Blackman Josh
Illinois Tech - Chicago-Kent College of Law, Chicago, IL, United States of America.
CodeX - The Stanford Center for Legal Informatics, Stanford, CA, United States of America.
PLoS One. 2017 Apr 12;12(4):e0174698. doi: 10.1371/journal.pone.0174698. eCollection 2017.
Building on developments in machine learning and prior work in the science of judicial prediction, we construct a model designed to predict the behavior of the Supreme Court of the United States in a generalized, out-of-sample context. To do so, we develop a time-evolving random forest classifier that leverages unique feature engineering to predict more than 240,000 justice votes and 28,000 case outcomes over nearly two centuries (1816-2015). Using only data available prior to decision, our model outperforms null (baseline) models at both the justice and case level under both parametric and non-parametric tests. Over nearly two centuries, we achieve 70.2% accuracy at the case outcome level and 71.9% at the justice vote level. Over the past century, we outperform an in-sample optimized null model by nearly 5%. Our performance is consistent with, and improves on, the general level of prediction demonstrated by prior work; however, our model is distinctive because it can be applied out-of-sample to the entire past and future of the Court, not just a single term. Our results represent an important advance for the science of quantitative legal prediction and portend a range of other potential applications.
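The out-of-sample protocol described above, in which each prediction uses only data available before the decision, can be illustrated with a minimal walk-forward sketch. This is not the authors' implementation: the record layout `(term, features, outcome)`, the function names, and the majority-class null model are all assumptions made for illustration. A random forest (or any other classifier) could be substituted by passing a different `fit` function.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple

# Hypothetical record layout: (term, features, outcome).
Record = Tuple[int, Tuple[int, ...], int]
Predictor = Callable[[Tuple[int, ...]], int]
# A "fit" function takes past records and returns a predictor over features.
Fit = Callable[[Sequence[Record]], Predictor]


def fit_majority_null(history: Sequence[Record]) -> Predictor:
    """Assumed null model: always predict the most common past outcome."""
    counts = Counter(outcome for _, _, outcome in history)
    majority = counts.most_common(1)[0][0]
    return lambda features: majority


def walk_forward_accuracy(records: Sequence[Record], fit: Fit) -> float:
    """Score each term with a model fit only on strictly earlier terms."""
    terms = sorted({term for term, _, _ in records})
    correct = total = 0
    for term in terms[1:]:  # the first term has no history to learn from
        history = [r for r in records if r[0] < term]
        predict = fit(history)
        for _, features, outcome in (r for r in records if r[0] == term):
            correct += int(predict(features) == outcome)
            total += 1
    return correct / total
```

Calling `walk_forward_accuracy(records, fit_majority_null)` gives a baseline accuracy. Running the same function with a classifier-backed `fit` gives a figure that can be compared against it, because both models see exactly the same history at each term.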