Pedagogically Aligned Objectives Create Reliable Automatic Cloze Tests.

Author information

Ondov Brian, Demner-Fushman Dina, Attal Kush

Affiliations

National Library of Medicine, Bethesda, MD, USA.

NYU Grossman School of Medicine, New York, NY, USA.

Publication information

Proc Conf. 2024 Jun;2024:3961-3972. doi: 10.18653/v1/2024.naacl-long.220.

Abstract

The cloze training objective of Masked Language Models makes them a natural choice for generating plausible distractors for human cloze questions. However, distractors must also be both distinct and incorrect, neither of which is directly addressed by existing neural methods. Evaluation of recent models has also relied largely on automated metrics, which cannot demonstrate the reliability or validity of human comprehension tests. In this work, we first formulate the pedagogically motivated objectives of plausibility, incorrectness, and distinctiveness in terms of conditional distributions from language models. Second, we present an unsupervised, interpretable method that uses these objectives to jointly optimize sets of distractors. Third, we test the reliability and validity of the resulting cloze tests compared to other methods with human participants. We find our method has stronger correlation with teacher-created comprehension tests than the state-of-the-art neural method and is more internally consistent. Our implementation is freely available and can quickly create a multiple choice cloze test from any given passage.
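The abstract names three objectives (plausibility, incorrectness, distinctiveness) and a joint optimization over sets of distractors, but does not give the formulas. The following is only a toy sketch of that general idea, not the authors' implementation. Every number, the log-sum scoring rule, and all names (`PLAUSIBLE`, `ALSO_CORRECT`, `SIMILAR`, `set_score`, and so on) are assumptions made for illustration; in practice the probabilities would come from a language model rather than a hard-coded table.

```python
import math
from itertools import combinations

# Hypothetical toy values standing in for language-model outputs.
# Item: "The doctor prescribed ___ for the infection." (answer: "antibiotics")

# Assumed probability of each word filling the blank (plausibility).
PLAUSIBLE = {
    "antivirals": 0.20,
    "painkillers": 0.25,
    "analgesics": 0.15,
    "penicillin": 0.30,
    "vitamins": 0.10,
    "bananas": 0.001,
}

# Assumed probability that the word would also be a correct answer.
ALSO_CORRECT = {
    "antivirals": 0.10,
    "painkillers": 0.05,
    "analgesics": 0.05,
    "penicillin": 0.90,
    "vitamins": 0.02,
    "bananas": 0.00,
}

# Assumed probability that two words mean the same thing to a test taker.
SIMILAR = {frozenset(("painkillers", "analgesics")): 0.95}
DEFAULT_SIMILARITY = 0.10


def item_score(word):
    """Log-plausibility plus log-probability of being incorrect."""
    return math.log(PLAUSIBLE[word]) + math.log(1.0 - ALSO_CORRECT[word])


def pair_score(a, b):
    """Log-probability that two distractors are distinct from each other."""
    sim = SIMILAR.get(frozenset((a, b)), DEFAULT_SIMILARITY)
    return math.log(1.0 - sim)


def set_score(words):
    """Joint score of a distractor set: per-item terms plus pairwise terms."""
    items = sum(item_score(w) for w in words)
    pairs = sum(pair_score(a, b) for a, b in combinations(words, 2))
    return items + pairs


def best_distractor_set(candidates, k=3):
    """Exhaustively search all k-subsets for the highest joint score."""
    return max(combinations(sorted(candidates), k), key=set_score)


def independent_top_k(candidates, k=3):
    """Baseline: rank candidates one at a time, ignoring distinctiveness."""
    return sorted(candidates, key=item_score, reverse=True)[:k]


if __name__ == "__main__":
    print("joint:      ", best_distractor_set(PLAUSIBLE))
    print("independent:", independent_top_k(PLAUSIBLE))
```

With these toy numbers the independent ranking keeps both "painkillers" and "analgesics", while the joint search replaces the near-synonym with "vitamins". It also passes over "penicillin" (likely a correct answer) and "bananas" (implausible). Exhaustive search is used only because the candidate list is tiny; a larger pool would need a cheaper search strategy.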
