Mirzababaei Behzad, Pammer-Schindler Viktoria
Know-Center GmbH, Graz, Austria.
Institute for Interactive Systems and Data Science, Graz University of Technology, Graz, Austria.
Front Artif Intell. 2021 Nov 30;4:645516. doi: 10.3389/frai.2021.645516. eCollection 2021.
This article discusses the usefulness of Toulmin's model of arguments for structuring the assessment of different types of wrongness in an argument. We discuss the usability of the model within a conversational agent that aims to support users in developing a good argument. We present a study and the development of classifiers that identify the existence of the structural components of a good argument, namely a claim, a warrant (underlying understanding), and evidence. Based on a dataset (three sub-datasets with 100, 1,026, and 211 responses, respectively) in which users argue about the intelligence or non-intelligence of entities, we have developed classifiers for these components: The existence and direction (positive/negative) of claims can be detected with a weighted average F1 score over all classes (positive/negative/unknown) of 0.91. The existence of a warrant (with warrant/without warrant) can be detected with a weighted average F1 score over all classes of 0.88. The existence of evidence (with evidence/without evidence) can be detected with a weighted average F1 score of 0.80. We argue that these scores are high enough to be of use within a conditional dialogue structure based on Bloom's taxonomy of learning, and we show by argument an example conditional dialogue structure that allows us to conduct coherent learning conversations. While our experiments show how Toulmin's model of arguments can be used to identify structural problems with argumentation, we also discuss how the model could be used in conjunction with a content-wise assessment of correctness, especially of the evidence component, to identify more complex types of wrongness in arguments, where argument components are not well aligned. Given the progress in argument mining and conversational agents, the next challenge could be developing agents that support learning argumentation. These agents could identify more complex types of wrongness in arguments that result from wrong connections between argumentation components.
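The scores reported above are weighted average F1 scores, that is, per-class F1 scores averaged with each class weighted by its number of true instances. The following is a minimal sketch of that metric, not the authors' evaluation code: the function name `weighted_f1` and the toy labels are illustrative assumptions, and the class names merely mirror the positive/negative/unknown claim classes mentioned in the abstract.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (hypothetical helper)."""
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Each class contributes in proportion to its share of the true labels.
        score += (n / total) * f1
    return score


# Toy data, invented for illustration only (not taken from the study).
y_true = ["positive", "positive", "negative", "unknown"]
y_pred = ["positive", "negative", "negative", "unknown"]
print(round(weighted_f1(y_true, y_pred), 2))  # 0.75
```

On the toy data, the per-class F1 scores are 2/3 (positive), 2/3 (negative), and 1 (unknown), with supports 2, 1, and 1, which gives 0.75. Weighting by support means that frequent classes dominate the score, which matters when classes are imbalanced.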