Nicodemus Kristin K, Elvevåg Brita, Foltz Peter W, Rosenstein Mark, Diaz-Asper Catherine, Weinberger Daniel R
Neuropsychiatric Genetics Group, Department of Psychiatry, Trinity Centre for Health Sciences, Trinity College Dublin, St James Hospital, Dublin, Ireland.
Psychiatry Research Group, Department of Clinical Medicine, University of Tromsø, Norway; Norwegian Centre for Integrated Care and Telemedicine (NST), University Hospital of North Norway, Tromsø, Norway.
Cortex. 2014 Jun;55:182-91. doi: 10.1016/j.cortex.2013.12.004. Epub 2013 Dec 20.
Category fluency is a widely used task that relies on multiple neurocognitive processes and is a sensitive assay of cortical dysfunction, including in schizophrenia. The test requires participants to name as many words belonging to a given category (e.g., animals) as possible within a short period of time. The core metrics are the overall number of words produced and the number of errors, namely non-members generated for a target category. We combine a computational linguistic approach with a candidate gene approach to examine the genetic architecture of this traditional fluency measure.
In addition to the standard metric of overall word count, we applied a computational approach to semantics, Latent Semantic Analysis (LSA), to analyse the clustering pattern of the categories generated, as this pattern likely reflects the search in memory for meanings. Because fluency performance probably also recruits verbal learning and recall processes, we included two standard measures of these processes: the Wechsler Memory Scale and the California Verbal Learning Test (CVLT). To explore the genetic architecture of traditional and LSA-derived fluency measures, we employed a candidate gene approach focused on SNPs with known function that were available from a recent genome-wide association study (GWAS) of schizophrenia. The selected candidate genes were associated with language and speech, verbal learning and recall processes, and processing speed. A total of 39 coding SNPs were included for analysis in 665 subjects.
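As a rough illustration of how clustering metrics might be derived from a fluency response once words have been placed in a semantic space, the sketch below computes the cosine similarity between consecutive words and counts a cluster switch whenever similarity falls below a cutoff. It is not the authors' procedure: the vectors are made-up toy values standing in for real LSA vectors, and the function name `fluency_metrics` and the `switch_threshold` of 0.5 are assumptions chosen only for illustration.

```python
import math

# Toy 3-dimensional vectors standing in for LSA word vectors.
# These values are invented for illustration; a real analysis would take
# vectors from an LSA space built on a large text corpus.
TOY_VECTORS = {
    "dog": (0.9, 0.1, 0.0),
    "cat": (0.8, 0.2, 0.1),
    "shark": (0.1, 0.9, 0.1),
    "whale": (0.2, 0.8, 0.0),
    "eagle": (0.0, 0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fluency_metrics(words, vectors, switch_threshold=0.5):
    """Summarise one response list (hypothetical helper).

    word_count      : all words produced (the traditional metric)
    mean_similarity : mean cosine similarity of consecutive known words
    switches        : consecutive pairs below the assumed threshold
    clusters        : switches + 1 when at least one word is known
    """
    known = [w for w in words if w in vectors]
    sims = [cosine(vectors[a], vectors[b]) for a, b in zip(known, known[1:])]
    switches = sum(1 for s in sims if s < switch_threshold)
    mean_sim = sum(sims) / len(sims) if sims else 0.0
    return {
        "word_count": len(words),
        "mean_similarity": mean_sim,
        "switches": switches,
        "clusters": switches + 1 if known else 0,
    }


if __name__ == "__main__":
    response = ["dog", "cat", "shark", "whale", "eagle"]
    print(fluency_metrics(response, TOY_VECTORS))
```

With the toy vectors, the response moves from pets to sea animals to birds, so the sketch reports two switches and three clusters. In a real setting the threshold and the treatment of words missing from the semantic space would need to be justified empirically.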
Given the modest sample size, the results should be regarded as exploratory and preliminary. Nevertheless, the data clearly illustrate how extracting the meaning from participants' responses, by analysing the actual content of words, generates useful and neurocognitively viable metrics. We discuss three replicated SNPs in the genes ZNF804A, DISC1 and KIAA0319, as well as the potential for computational analyses of linguistic and textual data in other genomics tasks.