Ikejiri Ryohei, Sumikawa Yasunobu
The University of Tokyo, Japan.
Tokyo Metropolitan University, Japan.
Data Brief. 2020 Jan 27;29:105185. doi: 10.1016/j.dib.2020.105185. eCollection 2020 Apr.
In this data article, we present a dataset of past causalities and categories designed to connect similar past and present causalities. First, we collect past causalities by referencing well-known Japanese high-school textbooks. We then select 138 causalities that are useful for drawing analogies between past and present and for considering solutions to present social issues. To strengthen the analogy, we describe each causality in three contexts: the background, including the problems; the solution methods; and their results. We define 13 categories based on the selected causalities and the Encyclopedia of Historiography. A past causality can belong to more than one category. In addition, to train machine learning models such as classifiers, we collect 900 past events from Wikipedia and assign one or more categories to each event. We perform statistical analyses to assess the quality of the dataset. The proposed applications of the dataset include training machine learning models, such as classifiers for past causalities, and information retrieval that ranks present social issues according to their similarity to past causalities.
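The retrieval application mentioned above can be illustrated with a minimal sketch. It assumes each past causality is stored as three text fields (background, solution, result) and scores present issues by bag-of-words cosine similarity. The field names, the example texts, and the choice of similarity measure are all assumptions made for illustration; they are not the dataset's actual schema or the authors' method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def causality_vector(record):
    """Merge the three contexts of a causality into one bag of words.

    The keys 'background', 'solution', and 'result' are hypothetical names
    for the three contexts described in the abstract.
    """
    text = " ".join(record[k] for k in ("background", "solution", "result"))
    return Counter(tokenize(text))


def rank_by_similarity(past_causality, present_issues):
    """Return (score, issue) pairs sorted from most to least similar."""
    query = causality_vector(past_causality)
    scored = [(cosine(query, Counter(tokenize(issue))), issue)
              for issue in present_issues]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical example data, not taken from the dataset.
past = {
    "background": "Farmers faced heavy debt and famine",
    "solution": "The government cancelled debts and reduced taxes",
    "result": "Farm output recovered",
}
issues = [
    "New stadium opens before football season",
    "Households face heavy debt after tax increases",
]
for score, issue in rank_by_similarity(past, issues):
    print(f"{score:.3f}  {issue}")
```

A word-count representation is used here only because it needs no external libraries. A trained classifier over the 13 categories, or learned text embeddings, could replace it without changing the ranking logic.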