Department of Computer Science and Engineering, Korea University, Seoul, South Korea.
NAVER Corp., Seongnam, South Korea.
Neural Netw. 2021 Aug;140:27-38. doi: 10.1016/j.neunet.2021.02.018. Epub 2021 Mar 3.
Although neural models have performed impressively on tasks such as image recognition and question answering, their reasoning ability has been measured in only a few studies. In this work, we focus on spatial reasoning and explore the spatial understanding of neural models. First, we describe two spatial reasoning IQ tests: rotation and shape composition. Using well-defined rules, we constructed datasets that span various complexity levels. We designed a variety of experiments to test generalization, and evaluated six different baseline models on the newly generated datasets. We provide an analysis of the results and of the factors that affect the generalization abilities of models. We also analyze how neural models solve spatial reasoning tests with visual aids. We hope that our work encourages further research into human-level spatial reasoning and provides a new direction for future work.