Peng Yi-Ting, Lei Chin-Laung
Department of Electrical Engineering, National Taiwan University, Taipei City, Taiwan.
PeerJ Comput Sci. 2024 Jan 31;10:e1841. doi: 10.7717/peerj-cs.1841. eCollection 2024.
People unfamiliar with the law may not know which behaviors are considered criminal or what sentence lengths those behaviors carry. This study used criminal judgments from the district court in Taiwan to predict the type of crime and the sentence length that would be determined. It pioneers the use of Taiwanese criminal judgments as a dataset and proposes improvements based on Bidirectional Encoder Representations from Transformers (BERT). The study is divided into two parts: criminal charge prediction and sentence prediction. Injury and public endangerment judgments were used as training data for sentence prediction. The study also proposes an effective solution to BERT's 512-token input limit. The results show that training a BERT model on Taiwanese criminal judgments is feasible. Accuracy reached 98.95% in predicting criminal charges, 72.37% in predicting sentences in injury trials, and 80.93% in predicting sentences in public endangerment trials.