Lau Joseph Cy, Landau Emily, Zeng Qingcheng, Zhang Ruichun, Crawford Stephanie, Voigt Rob, Losh Molly
Northwestern University, USA.
Autism. 2025 May;29(5):1346-1358. doi: 10.1177/13623613241304488. Epub 2024 Dec 20.
Many individuals with autism experience challenges using language in social contexts (i.e., pragmatic language). Characterizing and understanding pragmatic variability is important to inform intervention strategies and the etiology of communication challenges in autism; however, current manual coding-based methods are often time and labor intensive, and not readily applied to large samples. This proof-of-concept methodological study employed an artificial intelligence pre-trained language model, Bidirectional Encoder Representations from Transformers (BERT), as a tool to address such challenges. We applied BERT to computationally index pragmatic-related variability in autism and in genetically related phenotypes displaying pragmatic differences, namely, parents of autistic individuals, individuals with fragile X syndrome, and premutation carriers. Findings suggest that, without model fine-tuning, BERT's Next Sentence Prediction module was able to derive estimates that differentiate autistic from non-autistic groups. Moreover, these computational estimates correlated with manually coded characterizations of the pragmatic abilities that contribute to conversational coherence, not only in autism but also in the other genetically related phenotypes. This study represents a step forward in evaluating the efficacy of artificial intelligence language models for capturing clinically important pragmatic differences and variability related to autism, showcasing the potential of artificial intelligence to provide automated, efficient, and objective tools for pragmatic characterization to help advance the field.

Lay abstract

Autism is clinically defined by challenges with social language, including difficulties offering on-topic language in a conversation. Similar differences are also seen in genetically related conditions such as fragile X syndrome (FXS), and even among those carrying autism-related genes who do not have clinical diagnoses (e.g., the first-degree relatives of autistic individuals and carriers of the premutation), which suggests that the genes involved in autism also influence social language. Characterization of social language is therefore important for informing potential intervention strategies and understanding the causes of communication challenges in autism. However, current tools for characterizing social language in both clinical and research settings are very time and labor intensive. In this study, we test an automated computational method that may address this problem. We used a type of artificial intelligence known as a pre-trained language model to measure aspects of social language in autistic individuals and their parents, non-autistic comparison groups, and individuals with FXS and the premutation. Findings suggest that these artificial intelligence approaches were able to identify differences in social language in autism, and to provide insight into individuals' ability to keep a conversation on-topic. The computational estimates were also associated with broader measures of participants' social communication ability. This study is one of the first to use artificial intelligence models to capture important differences in social language in autism and genetically related groups, demonstrating how artificial intelligence might be used to provide automated, efficient, and objective tools for language characterization.
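To make the idea of a Next Sentence Prediction (NSP) estimate concrete, the following is a minimal sketch, not the authors' pipeline. The abstract does not say how utterances were paired or aggregated, so scoring each adjacent pair of conversational turns and averaging the results is an assumption made here for illustration. The helper names (`adjacent_pairs`, `coherence_estimate`, `make_bert_nsp_scorer`) and the `bert-base-uncased` checkpoint are likewise hypothetical choices. The BERT scorer relies on the Hugging Face `transformers` and `torch` packages, which are imported only when that function is called.

```python
from statistics import mean
from typing import Callable, List, Sequence, Tuple

# A scorer maps (first_turn, second_turn) to a probability-like value in [0, 1].
Scorer = Callable[[str, str], float]


def adjacent_pairs(turns: Sequence[str]) -> List[Tuple[str, str]]:
    """Pair each non-empty turn with the one that follows it (assumed pairing scheme)."""
    cleaned = [t.strip() for t in turns if t and t.strip()]
    return list(zip(cleaned, cleaned[1:]))


def coherence_estimate(turns: Sequence[str], scorer: Scorer) -> float:
    """Average the scorer's output over adjacent turn pairs (assumed aggregation)."""
    pairs = adjacent_pairs(turns)
    if not pairs:
        raise ValueError("need at least two non-empty turns")
    return mean(scorer(first, second) for first, second in pairs)


def make_bert_nsp_scorer(model_name: str = "bert-base-uncased") -> Scorer:
    """Build a scorer from a pre-trained BERT NSP head, with no fine-tuning.

    The checkpoint name is an assumption; the abstract does not name one.
    """
    # Imported lazily so the helpers above work without these packages installed.
    import torch
    from transformers import BertForNextSentencePrediction, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(model_name)
    model = BertForNextSentencePrediction.from_pretrained(model_name)
    model.eval()

    def score(first: str, second: str) -> float:
        inputs = tokenizer(
            first, second, return_tensors="pt", truncation=True, max_length=512
        )
        with torch.no_grad():
            logits = model(**inputs).logits  # shape (1, 2)
        # Index 0 is the "second text continues the first" class.
        return torch.softmax(logits, dim=-1)[0, 0].item()

    return score
```

With the packages installed, `coherence_estimate(turns, make_bert_nsp_scorer())` would return one value per conversation, where higher values mean the model judges each turn as a more plausible continuation of the previous one. Passing the scorer in as an argument keeps the pairing and averaging logic checkable without downloading a model.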