Exploring examinees' responses to constructed response items with a supervised topic model.

Affiliations

Kaiser Permanente Mid-Atlantic Permanente Research Institute, Rockville, Maryland, USA.

University of Georgia, Athens, Georgia, USA.

Publication information

Br J Math Stat Psychol. 2024 Feb;77(1):130-150. doi: 10.1111/bmsp.12319. Epub 2023 Sep 13.

Abstract

Textual data are increasingly common in test data, as many assessments include constructed response (CR) items as indicators of participants' understanding. The development of techniques based on natural language processing has made it possible for researchers to rapidly analyse large sets of textual data. One family of statistical techniques for this purpose is probabilistic topic models. Topic modelling is a technique for detecting the latent topic structure in a collection of documents and has been widely used to analyse texts in a variety of areas. The detected topics can reveal primary themes in the documents, and the relative use of topics can be useful in investigating the variability of the documents. Supervised latent Dirichlet allocation (SLDA) is a popular topic model in that family that jointly models textual data and paired responses, such as participants' textual answers to CR items and their rubric-based scores. SLDA assumes a homogeneous relationship between textual data and paired responses across all documents. This assumption, while useful for some purposes, may not hold when a population contains subgroups in which that relationship differs. In this study, we introduce a new supervised topic model that incorporates finite-mixture modelling into SLDA. This new model can detect latent groups of participants that have different relationships between their textual responses and associated scores. The model is illustrated with an analysis of a set of textual responses and paired scores from a middle-grades assessment of science inquiry knowledge. A simulation study is presented to investigate the performance of the proposed model under practical testing conditions.
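The idea of combining SLDA with a finite mixture can be made concrete with a small simulation. The sketch below is not the authors' model or estimation procedure, which the abstract does not specify; it only generates data from one plausible reading of it. The assumptions are: topics are shared across latent groups, each group has its own regression coefficients linking topic proportions to the score, groups are equally likely, and scores are Gaussian (rubric scores are usually ordinal, so this is a simplification). All names (`simulate_mixture_slda`, `eta`, `zbar`) are hypothetical.

```python
import random

def dirichlet(rng, alpha):
    """Draw from a Dirichlet distribution by normalising Gamma draws."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    return [x / total for x in g]

def simulate_mixture_slda(n_docs=100, n_topics=3, vocab_size=30, doc_len=40,
                          n_groups=2, sigma=0.5, seed=0):
    """Simulate responses and scores from an assumed mixture-SLDA process."""
    rng = random.Random(seed)
    # Topic-word distributions, assumed to be shared by all latent groups.
    beta = [dirichlet(rng, [0.5] * vocab_size) for _ in range(n_topics)]
    # Group-specific coefficients: the text-score relationship differs by group.
    eta = [[rng.gauss(0.0, 2.0) for _ in range(n_topics)]
           for _ in range(n_groups)]
    data = []
    for _ in range(n_docs):
        g = rng.randrange(n_groups)                # latent group (equal weights assumed)
        theta = dirichlet(rng, [0.5] * n_topics)   # document topic proportions
        z = rng.choices(range(n_topics), weights=theta, k=doc_len)
        words = [rng.choices(range(vocab_size), weights=beta[k])[0] for k in z]
        # Empirical topic frequencies drive the score, as in standard SLDA.
        zbar = [z.count(k) / doc_len for k in range(n_topics)]
        mean = sum(e * p for e, p in zip(eta[g], zbar))
        score = rng.gauss(mean, sigma)             # Gaussian score (assumption)
        data.append({"group": g, "words": words, "zbar": zbar, "score": score})
    return data, eta
```

Setting `n_groups=1` reduces this to the homogeneous case that standard SLDA assumes, in which a single coefficient vector relates topic use to scores for every document.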
