Graph-Based Word Alignment for Clinical Language Evaluation.

Author information

Prud'hommeaux Emily, Roark Brian

Affiliations

Rochester Institute of Technology, College of Liberal Arts, 92 Lomb Memorial Dr., Rochester, NY 14623.

Google, Inc., 1001 SW Fifth Avenue, Suite 1100, Portland OR 97204.

Publication information

Comput Linguist Assoc Comput Linguist. 2015 Dec;41(4):549-578. doi: 10.1162/coli_a_00232. Epub 2015 Dec 1.

Abstract

Among the more recent applications for natural language processing algorithms has been the analysis of spoken language data for diagnostic and remedial purposes, fueled by the demand for simple, objective, and unobtrusive screening tools for neurological disorders such as dementia. The automated analysis of narrative retellings in particular shows potential as a component of such a screening tool since the ability to produce accurate and meaningful narratives is noticeably impaired in individuals with dementia and its frequent precursor, mild cognitive impairment, as well as other neurodegenerative and neurodevelopmental disorders. In this article, we present a method for extracting narrative recall scores automatically and highly accurately from a word-level alignment between a retelling and the source narrative. We propose improvements to existing machine translation-based systems for word alignment, including a novel method of word alignment relying on random walks on a graph that achieves alignment accuracy superior to that of standard expectation maximization-based techniques for word alignment in a fraction of the time required for expectation maximization. In addition, the narrative recall score features extracted from these high-quality word alignments yield diagnostic classification accuracy comparable to that achieved using manually assigned scores and significantly higher than that achieved with summary-level text similarity metrics used in other areas of NLP. These methods can be trivially adapted to spontaneous language samples elicited with non-linguistic stimuli, thereby demonstrating the flexibility and generalizability of these methods.
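
The abstract names random walks on a graph as the basis of the alignment method but gives no details. As a rough illustration only, the Python sketch below shows one way such an approach could work: it assumes a graph whose nodes are retelling words and source words, with edges weighted by how often some baseline aligner proposed each link, and it scores candidates with a random walk with restart computed by power iteration. The graph construction, the `restart` value, the toy links, and all function names are assumptions made for this example, not the authors' published procedure.

```python
from collections import defaultdict


def build_graph(proposed_links):
    """Build an undirected weighted graph from (retelling_word, source_word) links.

    Nodes are tagged "r" (retelling) or "s" (source) so that identical word
    forms on the two sides stay distinct. Edge weight = number of proposals.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for retelling_word, source_word in proposed_links:
        r, s = ("r", retelling_word), ("s", source_word)
        graph[r][s] += 1.0
        graph[s][r] += 1.0
    return graph


def walk_distribution(graph, start, restart=0.5, steps=50):
    """Approximate the visit distribution of a random walk with restart.

    At every step the walker returns to `start` with probability `restart`,
    otherwise it follows an edge chosen in proportion to its weight.
    """
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        nxt[start] += restart
        for node, mass in dist.items():
            total = sum(graph[node].values())
            for neighbor, weight in graph[node].items():
                nxt[neighbor] += (1.0 - restart) * mass * weight / total
        dist = dict(nxt)
    return dist


def align(graph, retelling_word):
    """Return the source word with the highest visit probability, or None."""
    dist = walk_distribution(graph, ("r", retelling_word))
    candidates = {node[1]: p for node, p in dist.items() if node[0] == "s"}
    return max(candidates, key=candidates.get) if candidates else None


# Hypothetical proposals from a baseline aligner (invented toy data).
toy_links = [
    ("lady", "woman"), ("lady", "woman"), ("lady", "robbed"),
    ("woman", "woman"),
    ("stolen", "robbed"), ("stolen", "robbed"), ("stolen", "woman"),
]
toy_graph = build_graph(toy_links)

for word in ("lady", "stolen"):
    print(word, "->", align(toy_graph, word))
```

Power iteration is used here instead of sampled walks so that the result is deterministic; a sampled walk would estimate the same distribution.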
